If you’ve ever worked with older titles where multitrack sessions and stems were missing or never existed, you’ve experienced less-than-ideal audio, noise and frustration. Solving this particular problem has been Audionamix's bread and butter since 2003. Founded in Paris, this research-focused source separation technology company was initially tasked with removing songs from the full mixes of TV shows and movies, and its R&D team delivered. In fact, one of the company’s first real-world use cases was removing the droning vuvuzela noise from World Cup broadcasts for the French broadcaster Canal+. The team also worked on La Vie en Rose and Psycho before expanding to the US and bringing the technology to Hollywood studios.
Audionamix has grown and evolved since then, but their goal remains the same: using technology to help make things sound better. Recently, they have turned to AI to streamline the process.
To find out more, we reached out to Audionamix Professional Services senior coordinator and engineer, Stephen Oliver.
Some people might not know that Audionamix has two divisions — Products and Professional Services. Which came first?
Our services came first, handling otherwise impossible audio tasks for major Hollywood studios and record labels. Of course, we were always asked when our technology would be made public so clients could try it for themselves, and in 2014 we released our first software product. Happily, our products have been a great success, but that has never diminished the work of our Professional Services team. Our work is always evolving in support of our clientele, who are typically in the film and TV production realm, including producers, distributors and major motion picture studios.
Audionamix has embraced AI technology. How are you using it specifically, and how does it support Professional Services? Can it also be found in your products?
We’ve developed Audionamix source separation AI technology to identify and isolate specific audio elements from a full mix, such as dialogue, music, drums and bass. Our team has built advanced neural networks trained on thousands of files to learn the specific sonic characteristics of the elements to be separated. We use GPU processing so that the computationally intensive separation process remains efficient and scalable.
You can also find Audionamix AI in our products, which offer streamlined networks for DJs, musicians, podcasters and dialogue editors.
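To make the idea of mask-based source separation concrete, here is a minimal sketch, not Audionamix's proprietary method: a mixture is transformed into a spectrogram, a time-frequency mask selects the bins belonging to one source, and the masked spectrogram is resynthesized. In production systems a trained network predicts the mask from the mixture alone; here we use an oracle mask computed from known synthetic sources purely for illustration.

```python
import numpy as np

def stft(x, win=256, hop=128):
    """Short-time Fourier transform with a Hann window."""
    w = np.hanning(win)
    frames = [np.fft.rfft(w * x[i:i + win])
              for i in range(0, len(x) - win + 1, hop)]
    return np.array(frames)

def istft(X, win=256, hop=128):
    """Inverse STFT via windowed overlap-add."""
    w = np.hanning(win)
    n = hop * (len(X) - 1) + win
    out, norm = np.zeros(n), np.zeros(n)
    for i, frame in enumerate(X):
        s = i * hop
        out[s:s + win] += w * np.fft.irfft(frame, n=win)
        norm[s:s + win] += w ** 2
    norm[norm < 1e-8] = 1.0  # avoid dividing by ~0 at the edges
    return out / norm

# Two synthetic "sources": a low tone (a stand-in for bass) and a high tone.
sr = 8000
t = np.arange(sr) / sr
low = np.sin(2 * np.pi * 220 * t)
high = 0.5 * np.sin(2 * np.pi * 1760 * t)
mix = low + high

# Oracle ratio mask: a real separation network would predict this mask
# from the mixture alone; here we compute it from the known sources.
L, H, M = stft(low), stft(high), stft(mix)
mask = np.abs(L) / (np.abs(L) + np.abs(H) + 1e-8)

# Apply the mask to the mixture spectrogram and resynthesize the low source.
est_low = istft(mask * M)

# The estimate should correlate strongly with the true low tone.
corr = np.corrcoef(est_low, low[:len(est_low)])[0, 1]
```

The design choice worth noting is that all of the learning difficulty lives in predicting the mask; the STFT/iSTFT plumbing is fixed signal processing, which is why GPU-heavy training focuses on the mask-estimation network.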
Can you give us an example of a way you’re using this technology?
In a perfect world, a client would perpetually have access to all the elements or stems of a TV show, movie, album or individual song. But unexpected challenges arise, and archives are often lost or damaged over the years, which of course creates a technical barrier to remastering, streaming distribution and localization for global markets. That’s where our Professional Services team steps in. We combine engineering expertise with the latest advancements in our separation technology to remove this barrier by extracting vocals, dialogue and musical elements to allow for remonetization of classic assets.
What are some of your favorite projects that you’ve worked on from the services perspective? What were some of the challenges and fun parts to those?
A personal favorite was working on the global HD re-release of Baywatch. Re-licensing some of the songs from the original series would have been too expensive, so they decided to replace 350 cues with new, more affordable music.
For the foreign versions of the show, however, separate audio stems were no longer available, making it seemingly impossible to remove and replace just the music, which put the project in jeopardy. Our team removed songs from over 400 clips, preserving the dialogue and effects, and allowing for replacement music to be added. Listening to David Hasselhoff’s Mitch Buchannon in multiple languages was an added bonus!
Rewind to 2010, and we had a very different project — a 5.1 surround sound upmix for the 50th anniversary release of Alfred Hitchcock’s Psycho. At the time, our technology was not as advanced, so we relied on the skills of our team to isolate the audio elements from the mono mix of the film using the software we had available then. It was a difficult project, to say the least, but it served as a solid proof of concept that energized the future development of Audionamix technology.
More recently, we partnered with iFit to dub their trainer-led workout videos into other languages. iFit provides interactive personal training on home exercise equipment, and they wanted to preserve the natural background sound of the videos, but the footage had been recorded with only a single mic. So we stripped the original speech from the videos, leaving the background ambiance. iFit was then able to repurpose these recordings and dub them into multiple languages, while keeping the original sound of the environment.
What can the community expect to see from Audionamix in the future?
One of the key challenges studios face is data security. Many lots work on closed-server systems, and projects can’t leave the lot until release. Because our AI has been hosted on cloud-based GPUs, this can limit our ability to help certain clients.
We’ve been focused on creating an Enterprise solution to address the needs of those studios. Local Enterprise servers house our most advanced technology for unlimited on-site access. These servers can also be built directly into clients’ online asset management systems, meaning our best technology can be accessed within a studio’s secure infrastructure to eliminate the risk of data leaks.
Of course we’ve got some other things in the works as well that I can’t talk about here. But our clients’ needs are always evolving, and demand to utilize legacy content continues to grow, so our Audionamix team is always working to advance the audio technology space.