By Cory Choy
Sound plugins are incredible, particularly for audio restoration and especially for AI-powered audio restoration, noise reduction and voice creation. They allow engineers and even day-to-day prosumers to do things that were either extremely difficult or impossible to do even just a year ago, such as reducing reverb, interpolating and adding frequencies that are missing from the recording, and dynamically reducing anything that isn’t a voice. And they are fast.
Audio recordings that had been completely unusable are suddenly back in play. If there’s a signal, then there’s now a pretty good chance it can be cajoled into something usable. It’s often as simple as turning a single dial. These voice-filtering and noise-reducing technologies have been developed and popularized rapidly thanks to the rise of the smartphone, generative AI in the media, the increase in video conferencing and do-it-at-home solutions, advances in miniaturization, and more affordable and powerful processors.
Background
It’s important to think about what people define as “usable” or “preferable” when it comes to audio quality, particularly vocal audio quality. Much like in the ‘80s, when many vocalists and bands were recording in dead-silent rooms without any character (sometimes known as the “empty void”), the first generation of AI audio voice restoration plugins seems to want to bring all voices to a state of reverb-lessness, ambience-lessness and thickness on the low end. I’m not sure if that’s 100% preferable in the long run, and trying to keep things naturalistic is a bit more difficult. I wouldn’t be surprised, though, if future versions include convolution reverb modules that allow you to use impulse responses (IRs) to put these voices in almost any environment or space, along with ambisonics or other models to add perspective and depth.
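For readers who haven't worked with impulse responses, here is a minimal sketch of what a convolution reverb does: the dry voice is convolved with an IR of a space, and the result is blended back with the original. The function names, the `wet_mix` parameter and the tiny three-sample "room" are all illustrative assumptions, not taken from any of the plugins reviewed here.

```python
def convolve(dry, impulse_response):
    """Direct-form convolution of a dry signal with an impulse response."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out


def add_space(dry, impulse_response, wet_mix=0.3):
    """Blend the dry voice with its convolved (reverberant) version.

    wet_mix is a hypothetical 0..1 control: 0 is fully dry, 1 is fully wet.
    """
    wet = convolve(dry, impulse_response)
    # Pad the dry signal so the reverb tail is not cut off.
    padded_dry = dry + [0.0] * (len(wet) - len(dry))
    return [(1.0 - wet_mix) * d + wet_mix * w for d, w in zip(padded_dry, wet)]


# Toy data: a single click "recorded" dry, and a made-up room with one echo.
dry_voice = [1.0, 0.0, 0.0, 0.0]
room_ir = [1.0, 0.0, 0.5]
print(add_space(dry_voice, room_ir, wet_mix=0.5))
```

Real convolution reverbs use FFT-based convolution because IRs run to tens of thousands of samples, but the idea is the same.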
Additionally, because many of these plugins add or replace sonic data from your original source with data from AI training sets, sometimes they introduce unwelcome, artificial-sounding artifacts into the sounds. Or, if the dial is turned too far, they fundamentally change the nature of the speaker’s voice. The training sets themselves also have a lot of impact on the efficacy of the plugins – making those sets of data every bit as valuable as, or perhaps even more valuable than, the plugin code itself. This brings up questions that relate to all generative AI: Will people with certain accents or languages be favored because of accessibility in data sets? What impact will that have on results?
Before this technology came to plugins, it would have taken hours of work, lots of money and specialized equipment to accomplish what these tools can now accomplish in seconds or minutes… if it would have been possible at all. However, I maintain that there is absolutely no replacement for a good recording of a good performance. It is still always better to capture something the right way than to try to process oneself out of a sticky situation.
The Contenders
Before we begin, I would like to say that all of these plugins have astounded me by what they can deliver. That said, now that folks like me are getting past the initial shock of the newly possible, what I aim to do with this comparison is help pros figure out what solution is best for them. The plugins I am reviewing claim to be able to not only remove noise and echo but enhance the signal itself. I will be evaluating:
- Adobe Podcast Enhance Speech (beta)
- Accentize dxRevive Pro
- Hush Pro
- Goyo Supertone (soon to be Voice Clarity)
Not covered in this review:
- Acon Extract:Dialogue
- Waves Clarity
- iZotope RX Advanced, all modules (Note: This application’s “Spectral Recovery” function claims to have some of the same capabilities as the reviewed products but falls short in terms of what it can deliver.)
I decided NOT to include those tools in this review because, unlike the other products, they are more “traditional” denoisers in that they remove extraneous sounds and don’t aim to “enhance” the voice at all. I should add that they are also pretty incredible — competing admirably in a world that used to be completely dominated by Cedar and then iZotope.
Accessibility
Many people in the United States tend to think of the internet, and fast internet, as ubiquitous. And for many post pros, this is generally true. However, I’ve worked on projects in places — both nationally and internationally — where there is either no internet or spotty internet, so it’s important for me to know if I can access the program I’m using without the internet. I also work in Reaper, which is my DAW of choice, so Pro Tools-only plugins are not as valuable to me.
Adobe Podcast Enhance Speech: There is a basic version available for free on the internet. It is a stand-alone app that does not integrate with any DAWs. You upload a file, Adobe processes it and spits back a new file. No options or control. You are limited to 1 hour of content per day, and each file is limited to 30 minutes. There is also a Premium version available for $9.99 per month that gives you access to a dial for turning the “strength” up and down, and you can process up to 4 hours of tape per day.
Accentize dxRevive Pro: $225 for a VST3 plugin protected by iLok. It works with pretty much any DAW.
Hush Pro: $249 for a stand-alone app and an AAX for Pro Tools. It only works on a Mac.
Goyo Supertone: This tool is currently in beta for free but will eventually cost $99. The beta needs an internet connection to run, but the “pro” version will be accessible offline. It works with pretty much any DAW.
For me, the winner is dxRevive Pro. While I do not like iLok at all (it has caused me untold annoying problems, though the new cloud system is getting better), dxRevive Pro is the only plugin that works on both PC and Mac without an internet connection. The Hush Pro and Goyo plugins come in a close second. (If I used a Mac only, then Hush Pro would be just fine for me.) Rounding them out is Adobe Podcast Enhance Speech. While the Adobe tool is free, it does require an Adobe login.
Speed and Resources
Adobe Podcast Enhance Speech (beta): Runs in the cloud and runs fast. Most files tend to round-trip in a minute or two, and even long files usually take 10 minutes or less. No local resources are needed.
Accentize dxRevive Pro: I have a pretty fast computer (64GB RAM, SSD, relatively new processor, etc.), and this runs in real time, but it is a pretty heavy plugin. I don’t think you’d want to run many instances of this in real time, as it would substantially slow down your session. In my speed test, the unprocessed .wav file rendered at 116x real time on my machine, while the sample .wav file with dxRevive Pro rendered at 2.4x real time. That’s roughly 48 times slower.
Hush Pro: This plugin does not run on a PC at all. And while it can run on an Intel Mac, it’s so slow that it’s basically unusable. A 30-minute track took over 8 hours. That said, on the new M1 and M2 Macs, this app is lightning-fast. Renders are as fast as or faster than Adobe Podcast Enhance Speech, and they run locally.
Goyo Supertone: This is pretty fast! It’s not as fast as the Hush Pro on newer M1/M2 Macs, but render times clock in at 7.3x real time on my machine, so it renders about 3x faster than dxRevive Pro.
I would say that there isn’t a clear winner here. If you don’t have a fast computer and are a general consumer, then using someone else’s processors in the cloud can save a whole lot of time, and that’s a point for Adobe. An audio pro like me, however, is going to prefer something that can run in the session. If you are a Mac and Pro Tools user, I would say Hush Pro is the clear winner. If not, then I think Goyo has a slight edge over dxRevive Pro.
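To put the "x real time" figures above in practical terms, here is a quick back-of-the-envelope sketch: render time is simply the audio length divided by the speed factor. The three factors are the ones measured on my machine; the 30-minute track length and the function name are just illustrative assumptions, and your numbers will differ on other hardware.

```python
# Render-speed factors quoted in this section (multiples of real time).
RENDER_SPEED = {
    "unprocessed": 116.0,
    "dxRevive Pro": 2.4,
    "Goyo Supertone": 7.3,
}


def render_minutes(audio_minutes, speed_factor):
    """Estimated wall-clock render time for a given length of audio."""
    return audio_minutes / speed_factor


# Hypothetical 30-minute dialogue track.
for name, factor in RENDER_SPEED.items():
    print(f"{name}: {render_minutes(30, factor):.1f} min")

# How much lighter Goyo is than dxRevive Pro in this test.
print(f"Goyo vs. dxRevive Pro: {RENDER_SPEED['Goyo Supertone'] / RENDER_SPEED['dxRevive Pro']:.1f}x")
```

By this estimate, a 30-minute track takes about 12.5 minutes to render through dxRevive Pro and a little over 4 minutes through Goyo.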
Efficacy
Here’s the big question: What can these plugins do for me in terms of audio restoration?
To find out, I decided to try four different samples of audio to see how the different plugins treated them.
Sample 1: a thin “phone-quality” recording of a well-known actor with a well-known accent
- Adobe Podcast Enhance Speech (beta): It thickens up the voice in the mids in an incredible way, but it actually changes the character of the voice significantly and adds unpleasant artifacting (some sort of fluttering in the low mids). It sounds like a better recording in the end, but the tool changed the voice to a different person. The high end and esses are lacking or eliminated in a bad way.
- dxRevive Pro: Here’s where controls come in handy. There’s a “Phone Restore” preset that sounds pretty good, but the one that worked best, after some fiddling, was Low-End Restore, which gave me more bands to work with. Basically, you can set a different wet/dry level for each frequency range. There are also two different algorithms: Studio, which seems more aggressive in terms of adding to the voice, and Retain, which seems to keep the voice as it was and focus on denoise only. Ultimately, the voice is beefed up and retains more of its original characteristics, but there are still some artifacts added.
- Hush Pro: This tool also beefs up the low-mids a little, but not enough, and adds a little artifacting when cranked too high. When cranked too low, it doesn’t do enough to restore. Great for noise reduction, but it doesn’t add enough to the voice.
- Goyo Supertone: This plugin doesn’t add to the low/mid range of the voice at all. Great at eliminating room noise.
Winner: dxRevive Pro for sure.
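The per-band wet/dry idea mentioned for dxRevive Pro can be sketched in a few lines. This is only a conceptual illustration, not how the plugin works internally: it assumes the signal has already been split into bands, and the band names, sample values and function names are all hypothetical.

```python
def mix_bands(dry_bands, wet_bands, wet_amounts):
    """Blend processed (wet) and original (dry) audio separately per band.

    wet_amounts maps a band name to a 0..1 amount; bands that are not
    listed stay fully dry.
    """
    mixed = {}
    for band, dry in dry_bands.items():
        amount = wet_amounts.get(band, 0.0)
        wet = wet_bands[band]
        mixed[band] = [(1.0 - amount) * d + amount * w for d, w in zip(dry, wet)]
    return mixed


def sum_bands(bands):
    """Recombine the bands into one signal by summing them sample by sample."""
    return [sum(samples) for samples in zip(*bands.values())]


# Toy "phone-quality" voice: nothing in the lows, which the processing fills in.
dry = {"low": [0.0, 0.0], "mid": [0.5, 0.5], "high": [0.25, 0.25]}
wet = {"low": [0.5, 0.5], "mid": [0.5, 0.5], "high": [0.0, 0.0]}

# Fully wet in the lows, half in the mids, untouched highs to keep the esses.
result = sum_bands(mix_bands(dry, wet, {"low": 1.0, "mid": 0.5, "high": 0.0}))
print(result)
```

Keeping the top band dry is one way to add body to a thin voice without letting the processing eat the sibilance.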
Sample 2: a clear, “social media-quality” voice memo recording in a reverberant room with a constant hum
- Adobe Podcast Enhance Speech (beta): It does a great job of eliminating noise and hum and reducing long-tail reverb. Cuts off esses from the original voice and degrades the quality, making it feel a little dull and muddy.
- dxRevive Pro: The Retain mode reduces noise and hum really well, but, as intended, it doesn’t address reverb. The Studio mode hits the reverb well but ends up taking off too much ess on the high end if you’re not careful.
- Hush Pro: It does a great job of eliminating noise and hum. The voice retains characteristics and sounds good. Long echo is addressed, but not short echo.
- Goyo Supertone: It also quickly eliminates noise, but it needs to be cranked to hit the hum. Voice retains characteristics, but there’s no reduction in reverb, despite my cranking the “reverb” knob all the way down.
Winner: I would say it’s Hush Pro with a slight edge over dxRevive Pro and Goyo Supertone. Adobe is the clear loser.
Sample 3: a voice recorded in a buzzy space with a damaged microphone
- Adobe Podcast Enhance Speech (beta): Honestly, I’m disappointed. The high end does NOT sound better, maybe even slightly worse, and the esses are adversely affected/removed. Noise and buzz get eliminated, but I’m not happy with the vocal quality. Artifacts introduced.
- dxRevive Pro: The mid/high ends sound better. The tool doesn’t add to the esses where I would like it to — in general, this is something I’ve noticed that all of these plugins could do better (provide an option to re-ess) — but things sound noticeably clearer. Noise and buzz are eliminated, and verb is reduced. Very impressed.
- Hush Pro: Voice sounds mildly better, but not much – it’s still dull. Noise and buzz are eliminated, and there are no noticeable artifacts.
- Goyo Supertone: It eliminates noise and buzz, and the voice sounds slightly better when boosted, but verb is still a problem.
Winner: dxRevive Pro is the clear winner again.
Sample 4: a professionally recorded voice in a relatively noisy space, where a door slam occurs while the speaker is speaking
- Adobe Podcast Enhance Speech: Eliminates noise but does nothing for the door slam.
- dxRevive Pro: Eliminates noise, significantly reduces door slam. Very impressive!
- Hush Pro: Eliminates noise, reduces door slam, but not as much as dxRevive Pro.
- Goyo Supertone: Eliminates noise, but it does nothing for the door slam.
Winner: dxRevive Pro is the clear winner here again.
Conclusions
When it comes to heavy-duty restoration and enhancement, I would say that dxRevive Pro really impressed me. It does things that the other plugins can’t, and it is more adjustable. When it comes to ease of use and retaining vocal quality with lighter noise issues, I would say that Goyo Supertone and Hush Pro really shine because they don’t introduce artifacts the way the other two plugins sometimes do. But I don’t like that Hush Pro is Mac-only. All of the plugins are impressive in the way they reduce/eliminate long reverb — and Adobe is currently free! — though Goyo in particular is a little weaker than the others with verb reduction. Short echo and reverb remain my white whale — none of these plugins completely nail it. All of these solutions are incredibly affordable. Remember that they are competing with and in some ways surpassing Cedar DNS, which costs several thousand dollars.
The possibilities for noise reduction and restoration are truly astounding. One thing to keep an eye on, though, is what happens to the rest of the sonic landscape when we so thoroughly isolate the voice.
Cory Choy is an Emmy Award-winning sound mixer, audio restoration specialist and owner of Silver Sound in New York City. He recently won the Audio Storytelling category at Tribeca with his piece Aisha.
I have to agree I was very impressed with DxRevive Pro. So much so I bought it and I own RX.
As with all these things it’s easy to overdo it, but personally I’d stay away from Adobe.