By Mike McCarthy
As the first product coming to market featuring Nvidia's new Ada Lovelace architecture, the GeForce 4090 graphics card has a host of new features to test. With DLSS 3 for gaming, AV1 encoding for video editors and streamers, and raytracing and AI rendering for 3D animators, there are new options for a variety of potential users. While the GeForce line of video cards has historically been geared toward computer gaming, Nvidia knows the cards are also valuable tools for content creators, and a number of new features are designed especially for those users.
Like its predecessor, the Ampere-based 3090, the new GeForce 4090 Founders Edition card is a behemoth, taking up three PCIe slots and exceeding the full-height standard, so you will need a large case. At first glance, it looks like Nvidia just dropped the new card into the previous generation's shroud and cooling solution, but closer inspection shows otherwise: Nvidia kept the same overall design approach while adjusting and improving it in nearly every way.
Most significantly, the 4090 is a quarter-inch shorter than the 3090, which matters because I have repeatedly found the 3090 to be slightly too long for a number of cases, to the point of scratching the edge of my card while squeezing it into spots that were too tight. The shorter length allows Nvidia to use larger fans on the card (116mm instead of 110mm), which are now counter-rotating, with fewer blades and better fluid dynamic bearings. All of this is designed to increase airflow and minimize fan noise, and I believe Nvidia has succeeded.
Power Requirements
You will also need a large power supply to support the card's energy consumption, which peaks at 450 watts. Nvidia recommends at least an 850W power supply, so my Fractal Design 860W unit just barely qualifies. All the new cards use the new PCIe 5.0 power connector, which is similar to the 12-pin connector on the Ampere cards but adds four extra signaling pins.
While the previous card came with an adapter that combined the current from two 8-pin plugs, the new cards include an adapter that harnesses the power from up to four 8-pin PCIe power plugs, although only three are absolutely required. The lower-tier GeForce 4080 cards will have lower power requirements but use the same plug. Eventually, this new plug should become a standard feature on high-end power supplies, simplifying the process of powering the card.
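For the curious, here is a quick back-of-the-envelope check of why three plugs are enough, based on the published PCIe ratings of 150W per 8-pin plug and 75W through the slot itself. This is an illustrative sketch, not a figure from Nvidia's spec sheet for the adapter:

```python
# Rough power-budget check for the 8-pin-to-12VHPWR adapter.
# Assumes the standard PCIe ratings: 150W per 8-pin plug, 75W from the slot.

EIGHT_PIN_WATTS = 150   # rated delivery per 8-pin PCIe plug
SLOT_WATTS = 75         # power available through the x16 slot itself
CARD_PEAK_WATTS = 450   # the GeForce 4090's peak board power

for plugs in (3, 4):
    available = plugs * EIGHT_PIN_WATTS + SLOT_WATTS
    headroom = available - CARD_PEAK_WATTS
    print(f"{plugs} plugs: {available}W available, {headroom}W of headroom")

# 3 plugs: 525W available, 75W of headroom
# 4 plugs: 675W available, 225W of headroom (margin for overclocking)
```

So three plugs comfortably cover the card's 450W peak, and the fourth mainly buys headroom for overclocking.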
The one thing I don't like about the new plug is that it sticks straight out of the top of the card, which increases the minimum case width rather dramatically. I imagine Nvidia did this because a plug on the end of the card would interfere with hard drive bays and cooling systems in many cases, and since the underlying printed circuit board doesn't extend past the current connector location, there is nowhere farther down the card to put it. The upside is that the larger-volume chassis this requires should improve air-cooling efficiency. The card fits well in my case because I was prepared for it after the 3090 didn't fit in my main system two years ago, but I still don't find the current power cable solution elegant or ideal.
The new cards still use a 16-lane PCIe 4.0 interface: they don't yet saturate that connection's available bandwidth, so there is little to justify the expense of the emerging PCIe 5.0 standard available on new motherboards. The only time I could imagine that being an issue is when the connection is shared between two GPUs in two 8-lane slots, but that multi-card approach to increasing performance is falling out of favor in consumer systems. Part of the reason is the complexity of implementing SLI or CrossFire at the application level, but more significantly, individual GPU performance now scales much higher than it used to.
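To put numbers to that, here is a rough sketch of the usable per-direction bandwidth of each interface, based on the published line rates and 128b/130b encoding. It is a paper calculation, not a measurement:

```python
# Back-of-the-envelope PCIe bandwidth figures, per direction.
# PCIe 4.0 runs at 16 GT/s per lane; PCIe 5.0 doubles that to 32 GT/s.

def pcie_bandwidth_gbs(transfer_rate_gt: float, lanes: int) -> float:
    """Usable bandwidth in GB/s for one direction of a PCIe link."""
    encoding_efficiency = 128 / 130  # 128b/130b line encoding overhead
    return transfer_rate_gt * encoding_efficiency / 8 * lanes

print(f"PCIe 4.0 x16: {pcie_bandwidth_gbs(16, 16):.1f} GB/s")  # ~31.5 GB/s
print(f"PCIe 5.0 x16: {pcie_bandwidth_gbs(32, 16):.1f} GB/s")  # ~63.0 GB/s
print(f"PCIe 4.0 x8:  {pcie_bandwidth_gbs(16, 8):.1f} GB/s")   # ~15.8 GB/s
```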
As with current CPUs, the high-end GPU options now scale to far more performance than most users will ever fully use. To that point, there has been no mention of NVLink or similar technologies for harnessing multiple 40 Series GPUs together. Strong single-GPU scaling removes most of the need to combine separate chips or cards to increase performance, simplifying the end solution, and the new Ada Lovelace chip inside this card is up to twice as powerful as the previous Ampere generation.
The Ada Lovelace Chip
The GeForce 4090 is the top product in the new lineup, built on a 4nm process with 16,384 CUDA cores running at 2.5GHz (the previous-generation GeForce 3090 had 10,496 cores maxing out at 1.7GHz). The 4090 has nearly three times as many transistors, at 76 billion, and 12 times the L2 cache of the previous version, at 72MB. The memory configuration is similar to the 3090's, with 24GB running at 1TB/s, but it now uses lower-power memory chips. The other big change this generation is the eighth-generation NVENC hardware video encoder, which now supports AV1 encoding acceleration. With dual encoders operating in parallel, content up to 8Kp60 can be encoded in real time for high-resolution streaming.
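For a rough sense of how those core counts and clocks translate into the "up to twice as powerful" claim, here is the standard theoretical FP32 calculation (cores x 2 FLOPs per clock x boost clock), using the published boost clocks of 2.52GHz and 1.70GHz. It's a paper number that ignores memory and cache behavior, not a benchmark:

```python
# Theoretical FP32 throughput: CUDA cores x 2 FLOPs per clock (one fused
# multiply-add) x boost clock. Real-world performance depends on much more.

def fp32_tflops(cuda_cores: int, boost_ghz: float) -> float:
    return cuda_cores * 2 * boost_ghz / 1000

gf4090 = fp32_tflops(16384, 2.52)   # ~82.6 TFLOPS
gf3090 = fp32_tflops(10496, 1.70)   # ~35.7 TFLOPS
print(f"4090: {gf4090:.1f} TFLOPS, 3090: {gf3090:.1f} TFLOPS")
print(f"Generational ratio: {gf4090 / gf3090:.2f}x")  # ~2.3x
```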
More details on the AV1 encoding support are below in the video editing section.
New Software and Tools
Users can unlock many of the newest functions of the Ada hardware through the new software Nvidia has been developing. The biggest one, and the most relevant to gamers, is DLSS 3 (Deep Learning Super Sampling).
DLSS 2 used AI Super Resolution to decrease the number of pixels that needed to be rendered in 3D by intelligently upscaling a lower-resolution result. DLSS 3 takes this a step further, using AI-based Optical Multi-Frame Generation to create entirely new frames displayed between the rendered ones. The process is hardware-accelerated by Ada's fourth-generation Tensor cores and a dedicated Optical Flow Accelerator, and it doubles the frame rate even in CPU-bound games like Microsoft Flight Simulator. With both optimizations enabled, seven of every eight on-screen pixels were generated by the AI engine rather than the 3D rendering engine. This does lead me to wonder: does it upscale each frame first or double the frame rate first? My guess is that it doubles the frame rate first, so the interpolation process has less total data to sort through, but I don't know for sure.
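The seven-in-eight figure follows directly from the arithmetic, assuming the common Performance-mode 4x upscale (rendering at half resolution in each dimension) plus every-other-frame generation:

```python
# Where "seven of every eight on-screen pixels" comes from, assuming
# Performance-mode Super Resolution (1/4 of each frame's pixels rendered,
# e.g. 1080p upscaled to 4K) plus Frame Generation (every other frame).

upscale_factor = 4        # 1 rendered pixel per 4 displayed pixels
rendered_frame_share = 1 / 2   # half of displayed frames are rendered

rendered_fraction = (1 / upscale_factor) * rendered_frame_share
print(f"Rendered by the 3D engine:  {rendered_fraction:.3f}")      # 0.125 = 1/8
print(f"Generated by the AI engine: {1 - rendered_fraction:.3f}")  # 0.875 = 7/8
```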
Nvidia's other software advances are applicable to more than just increasing computer game frame rates. Nvidia Canvas is a new, locally executed version of Nvidia's previously cloud-hosted GauGAN application. It now offers a wider variety of controls and features, everything is processed locally on your own GPU, and it is free for any RTX user.
Nvidia RTX Remix is Nvidia's toolset for the game-modding community. It uses some very innovative technology to add raytracing support to older titles, and its hardware-level capture approach can export geometry and other 3D data to the USD format used by Nvidia's Omniverse, regardless of the original source format. While I don't do this type of work, I do play older games, so I am looking forward to seeing whether anyone brings these new technologies to bear on the titles I play.
Nvidia Broadcast, introduced two years ago, uses AI and hardware acceleration to clean up and modify the audio and video streams of online streamers and even teleconference participants in real time. It uses the GPU for background noise removal, visual background replacement and motion tracking of microphone and webcam data streams. Once again, via some creative virtual drivers, Nvidia does this in a way that is automatically compatible with nearly every webcam application.
Real-World Performance
I have provided a lot of details about this new card and the chip inside of it, but that still leaves the question: How fast is this card? For gaming, it offers more than twice the frame rates in DLSS 3-supported applications, like the upcoming release of Microsoft Flight Simulator. I was able to play smoothly at 8K with the graphics settings at maximum and got over 100fps in 4K with AI frame generation enabled. The previous flagship 3090 allowed 8K at low-quality settings and about 50fps at 4K with maximum graphics settings, so this is a huge improvement.
For content creation, the biggest improvements will be felt by users working in true 3D. With full render enabled for the viewport, the Blender UI feels dramatically different on the 4090 than on the 3090, which drops to a pixelated preview while you navigate.
Both the Blender and Octane render benchmarks report double the render performance with the 4090 compared to the previous 3090. That is a massive increase in performance for users of 3D applications that can fully use the CUDA and OptiX acceleration.
For video editors, the results are a little less clear-cut. Blackmagic DaVinci Resolve has a lot of newer AI-powered features that run on the GPU, so many of those functions are about 30% faster with the newer hardware. That could be significant for users who frequently use tools like cut detection, auto framing, magic mask or AI speed processing. This performance increase is on top of the new AV1 encoding acceleration in NVENC, which will significantly speed up exports to that format.
The improvements in Adobe Premiere Pro, where AV1 acceleration can be harnessed via the upcoming Voukoder encoding plugin, are much more subtle. Most Adobe users won't see big performance or functionality gains from these new cards without further software updates, specifically ones that allow native import and export of AV1 files.
AV1
AV1 is a relatively new codec that is intended to improve upon and, in many cases, replace HEVC. It produces higher-quality video at lower bit rates, which is important to those of us with limited internet bandwidth. And it comes with no licensing fees, which should accelerate support and adoption. The only real downside to AV1 is the encode and decode complexity. That’s where hardware acceleration comes into play.
The 30 Series Ampere cards introduced support for accelerated AV1 decoding, which allows people to play back AV1 files from YouTube or Netflix smoothly. But the GeForce 4090 is the first card with the eighth-generation NVENC engine, which supports hardware-accelerated encoding of AV1 files. This can be useful for streaming applications like OBS and Discord, for renders and exports from apps like Resolve and Premiere, and even for remote desktop tools like Parsec, all of which run on NVENC.
I for one am looking forward to the improved performance and possibilities that AV1 has to offer as it gets integrated into more products and tools. At higher bitrates, AV1 is not significantly better than HEVC, but at lower bitrates it makes a huge difference. This review was my first hands-on experimentation with the codec, and I found that 5Mb/s was sufficient to capture UHDp60 content, while 2Mb/s was usually enough for the 24fps UHD content I was rendering. I would usually recommend twice those data rates for HEVC encoding, so that is a significant reduction in bandwidth requirements.
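For anyone who wants to try the same experiment, here is a minimal sketch of driving the hardware AV1 encoder from Python by shelling out to ffmpeg. It assumes an ffmpeg build (version 5.1 or newer) that includes the av1_nvenc encoder alongside a 40 Series card, and the file names are placeholders:

```python
# Minimal sketch: hardware AV1 encoding via ffmpeg's av1_nvenc encoder.
# Requires an ffmpeg 5.1+ build with NVENC AV1 support and a 40 Series GPU.
import subprocess

def encode_av1_nvenc(src: str, dst: str, bitrate: str) -> None:
    """Transcode src to AV1 at the given bitrate using NVENC."""
    subprocess.run([
        "ffmpeg", "-y",
        "-i", src,
        "-c:v", "av1_nvenc",   # Ada's eighth-generation NVENC AV1 encoder
        "-b:v", bitrate,
        "-c:a", "copy",        # pass the audio through untouched
        dst,
    ], check=True)

# The data rates discussed above (file names are hypothetical):
encode_av1_nvenc("uhd_p60_master.mov", "uhd_p60_av1.mp4", "5M")   # UHDp60
encode_av1_nvenc("uhd_24fps_master.mov", "uhd_24_av1.mp4", "2M")  # 24fps UHD
```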
The future is here.
Summing Up
If you are doing true 3D animation and rendering work in an application that supports raytracing or AI denoising, then the additional processing power in this new chip will probably change your life. For most other people, especially video editors, it is probably overkill. A previous-generation card will do most of what you need at what is hopefully now a lower price. But if you need AV1 encoding or a few of the other new features, it will probably be worth it to spring for the newest generation card — just maybe not the top one in the lineup.
For those who want the absolute fastest GPU available in the world, there is no doubt that this is it. Performance-wise, there are no downsides; it is only a matter of whether you can justify the price and whether it will fit in your system. It is screaming-fast and full of new features.
Mike McCarthy is a technology consultant with extensive experience in film post production. He started posting technology info and analysis at HD4PC in 2007 and broadened his focus with TechWithMikeFirst 10 years later.