When NVIDIA announced its latest halo desktop GPU, we had a feeling that it would be fast. For the gaming community, additions such as more memory, more compute elements, and better ray tracing are a big deal. Since this is STH, we wanted to take a different view of the NVIDIA GeForce RTX 3090, specifically related to compute. With 24GB of memory, the same as the NVIDIA Titan RTX, and the promise of more compute resources, we thought this would be a winning combination. It turns out we were correct.
Video Companion
We also have a companion video discussing the impact of the performance we are seeing in this review.
The video looks more at the industry impact of the NVIDIA GeForce RTX 3090 on the GPU compute markets. Given the size of this review, and since the video’s subject is a bit different, we suggest opening the video in a separate tab. Note that the video came out several hours after the main site article, so we added it to the original article afterward.
NVIDIA GeForce RTX 3090 Overview
The card itself is a PCIe form factor in a very loose sense. There is a PCIe Gen4 x16 connector on the bottom of the card, but the card now occupies the width of around three PCIe slots. It does not stop at the top of the rear I/O faceplate, nor anywhere near the end of the x16 connector. Instead, it measures 313mm x 138mm, or 12.3″ x 5.4″, on top of its three-slot width.
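For readers checking chassis clearance, the unit conversion above can be reproduced with a quick sketch. The 313mm x 138mm figures come from this review. The 20.32mm (0.8″) slot pitch is our assumption based on the usual expansion slot spacing, not something NVIDIA publishes for this card, so treat the three-slot width as a rough estimate only.

```python
# Minimal sketch: convert the RTX 3090 FE dimensions quoted in this review
# from millimeters to inches, and estimate the space three slots occupy.

MM_PER_INCH = 25.4

# Assumption (not from the review): standard expansion slot pitch of 0.8 inch.
SLOT_PITCH_MM = 20.32


def mm_to_inches(mm: float) -> float:
    """Convert millimeters to inches."""
    return mm / MM_PER_INCH


# Dimensions quoted in the review text.
length_mm = 313
height_mm = 138

length_in = round(mm_to_inches(length_mm), 1)
height_in = round(mm_to_inches(height_mm), 1)

# Rough width consumed by three adjacent slots, under the pitch assumption.
three_slot_mm = 3 * SLOT_PITCH_MM

print(f"Length: {length_mm}mm = {length_in} in")
print(f"Height: {height_mm}mm = {height_in} in")
print(f"Three slots at {SLOT_PITCH_MM}mm pitch: {three_slot_mm:.2f}mm")
```

The rounded values match the 12.3″ x 5.4″ quoted above. Swapping in the interior measurements of a given chassis is a simple way to sanity-check fit before ordering.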
One can see the purpose of this size from the pictures. The card itself is a dual fan unit. Air is no longer directed through the card to the rear of the chassis as we see in blower-style coolers. It is not even directed to the same side of the card as we saw with the Titan RTX. Instead, this is a relatively small PCB surrounded by an enormous, albeit attractive, cooling apparatus.
Above, we can see just how much bigger this card is next to what is probably the most competitive card of the previous generation, the Titan RTX. For some comparison here, the Titan RTX is a $2,500 card, while the newer, larger RTX 3090 is $1,500.
We are going to let Jensen Huang, NVIDIA’s CEO, show off the other side. The otherwise attractive design almost perfectly exemplifies why the PCIe form factor is showing its age for a large card like this. The “RTX 3090” is printed on the card assuming you are looking at it from its top side at a rear angle. If, for example, you have the card situated with the PCIe connector facing down, as is common in deployments and in how we normally take product photos, then the “RTX 3090” is upside down. The Titan RTX had a nice solution to this conflict between branding and the brain’s search for logical order: the name was simply rotated 90 degrees.
On the top edge of the card, we have the new 12-pin power connector. Perhaps it is time for a new connector standard, but most will simply be using an adapter Y-cable with this card.
The rear of the unit has four display outputs. There is a single HDMI 2.1 port along with three DisplayPort 1.4a options. The card itself can drive four displays.
It is hard not to appreciate the heatsink design. These bare heatsink fins form attractive patterns, but we can imagine some will be damaged due to mishandling. By backlighting the card, we can see that the PCB is located closer to the rear I/O without opening the card up. The blue light permeates the right side, where the second fan is, because there is no PCB blocking it. If it is not clear from the pictures thus far, that should give a sense of just how much of this design is dedicated to cooling rather than to the GPU, memory, I/O, and power delivery.
We are going to quickly note here that this is a 350W GPU (we will see this later) and also that the card itself is both massive and heavy. For STH readers who build systems for customers, this is one we would be wary of shipping with only a few screws on the rear I/O plate. There is a risk of the card acting as a long and heavy lever inside a system during shipping.
Next, let us take a look at the RTX 3090 FE key specifications and continue with our performance testing.
Are you using the Tensorflow 20.11 container for all the machine learning benchmarks? It contains cuDNN 8.0.4, while the already released cuDNN 8.0.5 delivers significant performance improvements for the RTX 3090.
Great fp64 performance..
It’s not great fp64. The 3090’s AIDA64 GPGPU score of 638 is roughly 10% of the 6351 GFLOPS my Radeon Pro VII pulls down. https://twitter.com/hubick/status/1324203898949652480
How did the NVLinked Titan RTXs and Quadro RTX 8000s get better than 100% scaling in OctaneRender 4.0?
Chris Hubick
Misha, a well-known AMD shill, is being facetious. This is a graphics card and not a compute card like the Ampere A100, which trades the RT cores for FP64…
Would love to see this dataset run on an A100 for comparison. Is that review coming as well, or are those datasets not public?
Hi,
I LOVE the GeForce and Threadripper compute reviews (especially the YouTube video reviews!)
However, for our workload, we really need to know how the hardware performs for double-precision memory-bound algorithms.
The best benchmark that matches our problems (computational physics) is the HPCG benchmark.
Would it be possible to add HPCG results for the reviews? (http://www.hpcg-benchmark.org/)
Also, for some other computational physicists, having the standard LINPACK benchmark (for compute-bound algorithms) would be really nice to see as well (https://top500.org/project/linpack/)
– Ron