By Karen Moltenbrey
We all have visions of what life was like in the past, often gleaned from old photographs or video from a particular point in time. The 1911 film A Trip Through New York City, from Swedish company Svenska Biografteatern, provides a journey back through time with eight minutes of jittery, grainy, black-and-white footage showing highlights and snippets of life unfolding in NYC early in the last century. With artificial intelligence (AI), we can now take in those same sights and sounds in color and through a modern lens, as if the film had been shot only yesterday.
This breathtaking video is just one of the historic transformations that Denis Shiryaev and the multinational team at neural.love have created using machine learning, powered by NVIDIA Quadro RTX. The team is small, only four in all, but its work delivers big results.
Because of the massive processing power this work requires, Poland-based neural.love initially used a GeForce RTX 2080 Ti card for AI processing and fast realtime raytracing with physically accurate shadows, reflections, refractions and global illumination. Further testing and experience led the team to adopt the NVIDIA Quadro RTX 6000 as its GPU of choice, and it is used to retrain, upgrade and enhance the custom AI film enhancement pipeline. (The NVIDIA Quadro RTX 6000 GPU is available from PNY Technologies, across NALA and EMEAI.)
Built on NVIDIA’s Turing architecture and the RTX platform, the Quadro RTX 6000 card supports AI-enhanced workflows and offers hardware-accelerated raytracing and advanced shading capabilities. It’s a powerful card for this type of AI-based, high-resolution video pipeline.
“With the RTX 6000, we are now able to train and process more data than we ever could before,” says Shiryaev. “Prior to having access to the Quadro RTX 6000, it took us almost a whole week to process one minute of source footage; now we are able to process 60 minutes of footage in less time.”
The Quadro RTX 6000’s massive 24 GB of GPU memory lets neural.love dramatically increase the batch sizes of its training processes. “We are using all the Tensor cores available on this GPU because our projects consume a great deal of processing and computer power,” adds Shiryaev. The RTX 6000 GPU is equipped with 4,608 CUDA cores, 576 Tensor cores and 72 RT cores to complement the enhanced GPU memory capacity.
In addition to using the GPU to continuously train and retrain its neural networks, neural.love is tapping into the RTX 6000 to edit its videos with Adobe After Effects and Premiere Pro software. “This GPU is the core of the entire workflow for my YouTube channel and the work of our entire company,” explains Shiryaev.
How the Video Enhancement Is Done
“Our end goal, after running each frame through our pipeline of neural networks, is to generate 600 individual frames that have been upscaled to 4K resolution,” he says. The group also has the option of applying various other enhancements at this stage, including colorization, deblurring and others.
Once the group has enhanced each frame, they reassemble the frames into a complete enhanced video, which can be viewed at 4K resolution and 60 FPS. Even though the number of frames has more than doubled, the length of the video remains the same; the movement within it simply appears much more fluid and lifelike.
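The relationship between frame rate, frame count and running time can be checked with a little arithmetic. The sketch below is illustrative only: the eight-minute running time and the 60 FPS target come from this article, while the 24 FPS source frame rate is an assumed value chosen for the example, not a figure from neural.love.

```python
def frame_count(duration_s: float, fps: float) -> int:
    """Number of frames needed to fill duration_s seconds at a given frame rate."""
    return round(duration_s * fps)


def interpolation_factor(source_fps: float, target_fps: float) -> float:
    """How many output frames are needed per source frame."""
    return target_fps / source_fps


duration_s = 8 * 60   # eight minutes of footage, as described in the article
source_fps = 24       # ASSUMPTION: hypothetical source frame rate
target_fps = 60       # output frame rate mentioned in the article

source_frames = frame_count(duration_s, source_fps)
target_frames = frame_count(duration_s, target_fps)
factor = interpolation_factor(source_fps, target_fps)

print(f"source frames: {source_frames}")
print(f"target frames: {target_frames}")
print(f"frames multiplied by: {factor}")
# Running time is unchanged: more frames are shown per second, not more seconds.
print(f"running time at target rate: {target_frames / target_fps} s")
```

Under these assumed numbers the frame count grows by a factor of 2.5 while the running time stays at 480 seconds, which is why the extra frames show up as smoother motion rather than a longer video.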
By transitioning to the NVIDIA Quadro RTX 6000 card, Shiryaev estimates that neural.love is able to complete the process at least 10 times faster. “We never dreamed that a single PC could run several neural networks and process multiple videos at the same time.”
The process of building a neural network involves an extensive amount of source data, which is used to train the neural network. Building neural.love’s pipeline of customized neural networks from the ground up falls to team member Artem Legotin, a machine-learning engineer.
A Trip Through New York City
Shiryaev’s first historical enhancement video was Arrival of a Train at La Ciotat, which took him nearly three weeks to upscale using open-source algorithms. For that video, he used DAIN along with Topaz Gigapixel AI, since he had not yet built a custom pipeline. “I was curious if machine learning could be used in this way,” Shiryaev says. It could. And with great success: The video, posted to Shiryaev’s YouTube channel, has been viewed more than 4 million times, and it has been featured on various media platforms around the world.
Shiryaev continued to experiment with different approaches. A Trip Through New York City was his third project, which followed an upscaling of the Apollo 16 Lunar Rover “Grand Prix.” The goal was to create a more modern and enjoyable viewing experience. The clip was from the YouTube channel of Guy Jones, who had added ambient background noise, which Shiryaev later used in his enhancement.
The footage had many of the standard issues one finds with source material from this era: flickering, variable speeds, low FPS, low resolution, and an assortment of annoying visual artifacts like scratches, burns and other damage.
The team used customized versions of ESRGAN for the resolution upscaling and DAIN for the FPS boost; for colorization, they applied DeOldify, an open-source colorization algorithm based on machine learning. The basic versions of all three of these independent algorithms are publicly available, but some must be trained on a user’s own datasets – an essential prerequisite to producing high-quality enhancements of archival footage. The remaining algorithms were proprietary custom builds.
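To make the idea of chaining independent enhancement stages concrete, here is a toy sketch in Python. It is not neural.love's pipeline and does not call ESRGAN, DAIN or DeOldify: each stage is a deliberately trivial stand-in (pixel duplication, frame averaging, gray-to-RGB copying), all function names are hypothetical, and the order of the stages is an assumption, since the article does not state it.

```python
from typing import List, Tuple

GrayFrame = List[List[int]]                    # rows of grayscale pixel values
ColorFrame = List[List[Tuple[int, int, int]]]  # rows of (R, G, B) pixels


def upscale_2x(frame: GrayFrame) -> GrayFrame:
    """Stand-in for a learned upscaler: duplicate every pixel and every row."""
    out = []
    for row in frame:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def interpolate(frames: List[GrayFrame]) -> List[GrayFrame]:
    """Stand-in for learned frame interpolation: insert the average of each
    pair of neighbouring frames between them."""
    if not frames:
        return []
    out = [frames[0]]
    for prev, nxt in zip(frames, frames[1:]):
        mid = [[(a + b) // 2 for a, b in zip(r1, r2)]
               for r1, r2 in zip(prev, nxt)]
        out.extend([mid, nxt])
    return out


def colorize(frame: GrayFrame) -> ColorFrame:
    """Stand-in for learned colorization: copy gray into all three channels."""
    return [[(p, p, p) for p in row] for row in frame]


def enhance(frames: List[GrayFrame]) -> List[ColorFrame]:
    """Run the stages in an ASSUMED order: upscale, interpolate, colorize."""
    upscaled = [upscale_2x(f) for f in frames]
    smoothed = interpolate(upscaled)
    return [colorize(f) for f in smoothed]


# Two tiny 1x2 grayscale frames stand in for a clip.
clip = [[[0, 100]], [[100, 200]]]
result = enhance(clip)
print(len(result), "frames of size", len(result[0]), "x", len(result[0][0]))
```

Keeping each stage as a separate function with a plain frames-in, frames-out shape means any one stand-in could, in principle, be swapped for a trained model without touching the others.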
The group used more than 10 neural networks in its customized pipeline, and they continue to utilize more sophisticated neural network and machine learning techniques in a never-ending quest to provide even better results. By moving to the Quadro RTX 6000 as the preferred processing platform, they realized synergistic hardware benefits, enabling them to accomplish much more than ever before.
Historical Perspective
The technology employed by neural.love brings historical videos to life, making history more accessible to a modern audience. “The current trajectory of this technology is pointing toward a future in which we will be able to provide hyper-realistic engagement with old footage,” says Shiryaev. “We plan to continue experimenting in this field and are extremely curious where it will lead us in the next 10 years.”
This work is indeed history in the making.
To Register for PNY’s Live Webinar: “Film and Video Enhancement with NVIDIA Quadro RTX Featuring neural.love’s Denis Shiryaev,” click here.