By Dug Green
With digital humans charting on the Gartner hype cycle, Epic Games’ eye-popping The Matrix Awakens: An Unreal Engine 5 Experience demo, and the “Aloy” character from Horizon Forbidden West becoming the first digital character to star on a Vanity Fair cover, you could say that our digital counterparts have well and truly taken the stage.
The advancement of real-time game engines and graphical fidelity means the days of two-dimensional game characters are far behind us. This new era of cinematic realism and narrative in the gaming world continues to welcome new generations of astoundingly lifelike digital humans, each wave the result of the skill and dedication of everyone in the industry. With these highly advanced characters now accepted as the standard, our collective focus has shifted to their finer details.
Perhaps audiences have a new relationship with game characters entirely. We’re increasingly invested in them, the way we are with on-screen characters played by our favorite actors. We want to connect with them and be affected by their performances in the same way. Gamers and game studios alike count on players being able to escape into transportive modern gameplay without the suspension of disbelief ever shattering.
Human Facial Perception: Our Universal Superpower
So why are faces such a challenge? A human face is an incredibly complex structure, unique to each of us and capable of a seemingly infinite array of expressions, driven by the movement of muscles that connect to the skin and fascia of the face. Plus, it’s much harder to cross the Uncanny Valley with animation than with a static photoreal image.
Even though the range and type of facial movement is incredibly diverse by design, the entire spectrum (and every subtlety on that spectrum) is highly recognizable to humans as human. Facial perception has been our universal superpower since birth; we even have a specialist area of the brain dedicated to the task. So where facial animation is concerned, not only are the stakes high, but the challenge for the industry is more than considerable.
With the traditional control-rig-based approach to facial animation, achieving the fidelity that’s now expected is becoming more challenging and more time-consuming. To reach these greater levels of detail, rigs for in-game animation are approaching the complexity of those used for film visual effects. And as the rigs become more complex, it becomes ever more difficult to implement them for a particular character and then achieve the level of animation “polish” that modern graphics demand.
Game-Changing Digital Doubles
It’s a fascinating challenge, and as the technology has advanced, so has the approach. But we have always felt that the key to creating the most lifelike human face lies in capturing and using data derived directly from actual human performance.
As opposed to the traditional pipeline of facial capture solutions, we believe the digital-double route — creating facial animation by precisely capturing a real-life actor’s facial performance — is by far the most effective way to create the most realistic game characters possible at scale.
It works much the same way you might cast an actor for a stage production or film – the actor might be playing a fictional role, but it is nonetheless their likeness and their performance. Experience and research have shown that we will only ever get close to success by attempting to design, construct and refine a face and its movement ourselves. But by casting real humans to start with and using DI4D’s performance-driven facial animation services, AAA game studios are now removing that subjectivity from the equation.
Since 2005, DI4D has been working with game studios on stereophotogrammetry and optical-flow tracking for 4D facial performance capture. 4D facial capture systems were mainly used for research in fields such as psychology and facial surgery before interest grew in using this type of capture for next-gen in-game animation. Not only does 4D capture produce much higher-fidelity data than traditional facial mocap, but it also eliminates the need for markers, makeup or structured light.
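To make the idea concrete, here is a minimal, hypothetical sketch of the optical-flow half of that kind of pipeline: tracking the same facial points from frame to frame so every point keeps a consistent identity over time. This is not DI4D’s proprietary implementation; it simply illustrates the general principle using OpenCV, and the footage path and tracking parameters are placeholders.

    # Toy illustration (not DI4D's pipeline): propagate tracked facial
    # points from frame to frame with pyramidal Lucas-Kanade optical flow,
    # so the same points persist across the whole sequence.
    import cv2

    cap = cv2.VideoCapture("face_footage.mp4")   # hypothetical capture video
    ok, prev = cap.read()
    if not ok:
        raise SystemExit("placeholder path: no footage found")
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

    # Seed points to track. In a real 4D pipeline these would be dense mesh
    # vertices projected into the first frame; here we just pick corners.
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                  qualityLevel=0.01, minDistance=5)

    tracks = [pts.reshape(-1, 2)]                # one 2D trajectory per point
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Optical flow carries each point into the new frame, preserving its
        # identity over time. (For brevity we ignore the status flags that
        # mark points which failed to track.)
        pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
        tracks.append(pts.reshape(-1, 2))
        prev_gray = gray

Because every point keeps its identity across frames, the per-frame 3D reconstructions share a consistent layout over time, which is what makes the data “4D” (3D geometry plus time) rather than a series of unrelated scans.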
Powering today’s game heroes via precise human data completely removes risk and subjectivity from the digital-doubles equation. What you see in the performance capture shoot is what you get in-game.
Human-Powered Digital Doubles
In Quantum Break, for example, actor Shawn Ashmore appears directly in-game as a digital double, with facial performance data captured from the actor himself driving the animation and likeness. The end result is a precise replica of Shawn’s performance.
High-fidelity 4D facial performance data is also behind the eight digital protagonists in Activision and Infinity Ward’s cinematic cutscenes for Call of Duty: Modern Warfare.
To achieve the highest level of realism, full-performance capture — sometimes referred to as “pcap” — was used to simultaneously capture body motion and facial performance of the scenes involving the main actors. This performance-capture data was used by Blur Studio to drive digital-double versions of the real-life actors.
We processed the stereo helmet-mounted camera (HMC) data captured during the sessions, constructing an accurate 3D scan from each frame of stereo footage to obtain the most faithful reproduction possible of the actors’ facial performances. In this way, studios can deliver every nuance of a real-life performance to gamers.
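As a rough illustration of that per-frame reconstruction step, the hypothetical sketch below triangulates matching 2D points from the left and right HMC views into 3D, one “scan” per frame. The calibration matrices and point coordinates are invented for the example; a production pipeline would use dense, precisely calibrated photogrammetry rather than this toy setup.

    # Hypothetical per-frame stereo reconstruction: corresponding 2D points
    # in the two HMC views, plus calibrated projection matrices, triangulate
    # to 3D points. All calibration values here are made up.
    import cv2
    import numpy as np

    def triangulate_frame(P_left, P_right, pts_left, pts_right):
        """pts_* are (N, 2) pixel coordinates of matched points."""
        # OpenCV expects 2xN arrays and returns 4xN homogeneous points.
        X_h = cv2.triangulatePoints(P_left, P_right,
                                    pts_left.T.astype(np.float64),
                                    pts_right.T.astype(np.float64))
        return (X_h[:3] / X_h[3]).T              # (N, 3) points in space

    # Made-up calibration: identical cameras, right one offset 60 units in x.
    K = np.array([[800., 0., 320.],
                  [0., 800., 240.],
                  [0., 0.,   1.]])
    P_left  = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P_right = K @ np.hstack([np.eye(3), np.array([[-60.], [0.], [0.]])])

    # One matched point pair yields one 3D point; a dense match across the
    # whole face yields the per-frame scan described above.
    pts_l = np.array([[310., 240.]])
    pts_r = np.array([[230., 240.]])
    print(triangulate_frame(P_left, P_right, pts_l, pts_r))  # ~[-7.5, 0, 600]

Repeating this for every frame, with the optical-flow tracking supplying consistent correspondences over time, is what turns a stereo HMC session into performance data that can drive a digital double directly.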
Faces of the Future
Previously, the limiting factor in the realism of game characters was the power of the engine driving them. Now the limiting factor is the ability to create hyper-realistic content for that engine.
Now that full performance capture can be used in games, the way games are made is changing. In capturing actors’ face and body performances and fully reproducing them in-engine, the games industry’s pipeline increasingly resembles the one used for film and TV production.
Actors for games can now be cast, shot and captured, and their performances faithfully reproduced in-engine, ready for the screen. It seems reasonable to assume that this could lead to new styles of performance and story-driven games. Deeper emotional connections can be forged, richer characters developed and more complex narratives opened up.
It’s possible to imagine a future where in-game performances are celebrated and awarded in the same way as those for TV and film are now. With data from real humans now powering game characters and the technology continuously evolving, players are becoming ever more immersed in more sophisticated and characterful cinematic gaming worlds.
For our part, we’ve developed a proprietary 4D facial capture solution that is scalable and efficient. This pipeline allows developers to use high-fidelity 4D facial performance capture to drive high volumes of in-game animation for realistic digital-double characters.
In the future, we won’t even think about facial animation; the exact performance captured will appear in the game. The realism debate will dissipate. It will become art in the same way an actor’s performance is art. This is the age of the digital double, and game developers need to be ready.
Dug Green is co-founder/COO of DI4D, a 4D facial performance capture studio that specializes in animation for digital doubles. DI4D has worked on a host of AAA games projects, including Call of Duty: Modern Warfare, Quantum Break and F1 2021’s Braking Point.
Totally agree with the last paragraph. We are approaching the acceptance threshold for games (cinema has already rushed past the finish line).
We used the Dynamixyz stack for our film Adipurush. Now that Dxyz has shut down, it leaves the market open for innovative players like DI4D.
Can’t wait to give their stack a shot on our next project.
Cheers Dug.
Behram Patel
Head Of Technology, Retrophiles