NVIDIA has four new Ampere-based GPU models launching at GTC 2021. We are focusing on the PCIe versions here, not the notebook versions. Let us get into the details of the quartet of new GPUs. Please note, this is being written during the keynote, so we may update it as more information is released.
NVIDIA A10 and A16 GPUs for Data Centers
The NVIDIA A10 and A16 lack display outputs, so these are aimed at the data center. In previous generations, these would have had names such as “NVIDIA Tesla” or “NVIDIA GRID” but those have been retired as NVIDIA blends its data center and workstation lines.
NVIDIA A10
The NVIDIA A10 is a single-slot GPU designed to offer an uplift over the current NVIDIA T4, but with a larger footprint and higher power consumption. NVIDIA had a full-height 150W version of the T4 ready that was not released publicly, and the A10 seems to follow in that line.
NVIDIA A10 SPECIFICATIONS | |
FP32 | 31.2 TF |
TF32 Tensor Core | 62.5 TF | 125 TF* |
BFLOAT16 Tensor Core | 125 TF | 250 TF* |
FP16 Tensor Core | 125 TF | 250 TF* |
INT8 Tensor Core | 250 TOPS | 500 TOPS* |
INT4 Tensor Core | 500 TOPS | 1000 TOPS* |
RT Cores | 72 |
Encode / Decode | 1 encoder, 1 decoder (+AV1 decode) |
GPU Memory | 24 GB GDDR6 |
GPU Memory Bandwidth | 600 GB/s |
Interconnect | PCIe Gen4: 64 GB/s |
Form Factor | 1-slot FHFL |
Max TDP Power | 150W |
vGPU Software Support | NVIDIA vPC/vApps, NVIDIA RTX™ vWS, NVIDIA Virtual Compute Server (vCS) |
Secure and Measured Boot with Hardware Root of Trust | Yes |
NEBS Ready | Level 3 |
Power Connector | PEX 8-pin |
The asterisk (*) denotes figures that rely on sparsity, so NVIDIA is getting aggressive with performance claims here.
Overall, this is not going to be the fastest GPU, but it is a single-slot GPU, which is simply needed in some systems or is preferable to dual-slot GPUs like the A100s we saw in our recent ASUS RS720A-E11-RS24U Review.
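As a quick sanity check on those starred numbers, here is a minimal Python sketch that uses only the Tensor Core figures from the A10 table above. It simply divides each sparse figure by its dense counterpart. The variable and function names are our own, chosen for illustration.

```python
# Dense and sparse (*) Tensor Core figures copied from the A10 table above.
# Units are TF for the floating-point formats and TOPS for the integer formats.
a10_tensor = {
    "TF32": (62.5, 125.0),
    "BFLOAT16": (125.0, 250.0),
    "FP16": (125.0, 250.0),
    "INT8": (250.0, 500.0),
    "INT4": (500.0, 1000.0),
}


def sparsity_ratio(dense, sparse):
    """Return how many times larger the sparse figure is than the dense one."""
    return sparse / dense


ratios = {name: sparsity_ratio(dense, sparse)
          for name, (dense, sparse) in a10_tensor.items()}

for name, ratio in ratios.items():
    print(f"{name}: sparse figure is {ratio:.1f}x the dense figure")
```

Every format comes out at exactly 2x, so the starred column is the dense column doubled. Workloads that do not use sparsity should be compared against the unstarred figures.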
NVIDIA A16
The second GPU being announced is the NVIDIA A16. This card puts four Ampere GPUs, each with 16GB of memory, on a single PCIe card. If you saw our piece on the NVIDIA GRID M40 cards with 4x Maxwell GPUs and 16GB of RAM, you will see the lineage back to Maxwell.
Feature Type | A16 |
GPUs/board Architecture | 4 GPUs on one board – NVIDIA Ampere |
Memory Size | 64 GB GDDR6 (16 GB per GPU) |
vGPU Software Support | NVIDIA Virtual PC (vPC), NVIDIA Virtual Applications (vApps), NVIDIA RTX Workstation (vWS), NVIDIA Virtual Compute Server (vCS) |
vGPU Profiles (GB) | 1, 2, 4, 8, 16 |
Media Acceleration | 4x NVENC, 8x NVDEC |
Video Codec (Encode) | H.264/H.265 (+4:4:4) |
Form Factor | FHFL Dual Slot |
Max Power Consumption | 250W |
Graphics Bus | PCIe Gen 4 |
NEBS ready | Yes |
Power Connector | 8-pin CPU |
The primary market for this type of GPU is VDI. One can give each VM a dedicated physical GPU, or parcel out each of these smaller GPUs to multiple users.
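To put rough numbers on that, here is a minimal Python sketch that estimates how many VMs one A16 board could host at each vGPU profile size listed in the table above. It assumes framebuffer memory is the only limit, which is our simplification: real deployments are also bounded by licensing, scheduling, and whatever per-GPU maximums NVIDIA's vGPU software enforces. The function name is hypothetical.

```python
# Figures copied from the A16 table above.
GPUS_PER_BOARD = 4
MEMORY_PER_GPU_GB = 16
PROFILE_SIZES_GB = [1, 2, 4, 8, 16]


def max_vms_per_board(profile_gb):
    """Upper bound on VMs per board, ASSUMING memory is the only constraint.

    Each vGPU profile is carved out of a single physical GPU, so we count
    how many profiles fit in one GPU's memory and multiply by the GPU count.
    """
    return GPUS_PER_BOARD * (MEMORY_PER_GPU_GB // profile_gb)


density = {size: max_vms_per_board(size) for size in PROFILE_SIZES_GB}

for size, vms in density.items():
    print(f"{size} GB profile: up to {vms} VMs per A16 board")
```

Under that assumption the range runs from 64 VMs per board at the 1GB profile down to 4 VMs at the 16GB profile, where each VM effectively gets a whole physical GPU.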
NVIDIA A4000 and A5000 GPUs
One of the big differentiators between the A10 and A16 GPUs and the A4000 and A5000 GPUs is that the A10/A16 do not have display outputs while the A4000 and A5000 do. We can think of the A4000 and A5000 GPUs as coming from the line formerly called “NVIDIA Quadro”.
NVIDIA A4000
The NVIDIA A4000 is the lower-end GPU of the two. It is a 140W card with 16GB of memory, less than the A10’s 24GB, and it has four DisplayPort outputs.
Architecture | NVIDIA Ampere Architecture |
Foundry | Samsung |
Process Size | 8nm |
Transistors | 17.4 billion |
Die Size | 392.5 mm² |
CUDA Parallel Processing cores | 6,144 |
NVIDIA Tensor Cores | 192 |
NVIDIA RT Cores | 48 |
Single-Precision Performance | 19.2 TFLOPS |
RT Core Performance | 37.4 TFLOPS |
Tensor Performance | 153.4 TFLOPS |
GPU Memory | 16 GB GDDR6 with ECC |
Memory Interface | 256-bit |
Memory Bandwidth | 448 GB/s |
Max Power Consumption | 140W |
Graphics Bus | PCI Express 4.0 x16 |
Display Connectors | DP 1.4 (4) |
Form Factor | 4.4” H x 9.5” L Single Slot |
Product Weight | 500 g |
Thermal Solution | Active |
NVIDIA® 3D Vision® and 3D Vision Pro | Support via 3 pin mini DIN |
Frame lock | Compatible (with Quadro Sync II) |
Power Connector | 1x 6-pin PCIe |
NVENC / NVDEC | 1x / 1x (+AV1 decode) |
Perhaps the biggest point here is that this GPU has an active cooler, so it is more aligned to the workstation market than the A10 and A16, which are aimed at data centers.
NVIDIA A5000
The NVIDIA A5000 is the bigger of the two GPUs being launched for the workstation market. This has more compute elements, more memory, and higher power consumption than the A4000.
Architecture | NVIDIA Ampere Architecture |
Foundry | Samsung |
Process Size | 8nm |
Transistors | 28.3 billion |
Die Size | 628.4 mm² |
CUDA Parallel Processing cores | 8,192 |
NVIDIA Tensor Cores | 256 |
NVIDIA RT Cores | 64 |
Single-Precision Performance | 27.8 TFLOPS |
RT Core Performance | 54.2 TFLOPS |
Tensor Performance | 222.2 TFLOPS |
GPU Memory | 24 GB GDDR6 with ECC |
Memory Interface | 384-bit |
Memory Bandwidth | 768 GB/s |
Max Power Consumption | 230W |
Graphics Bus | PCI Express 4.0 x16 |
Display Connectors | DP 1.4 (4) |
Form Factor | 4.4” H x 10.5” L Dual Slot |
Product Weight | 1.025 kg |
Thermal Solution | Active |
vGPU Software Support | NVIDIA® Virtual PC/Virtual Applications (vPC/vApps), NVIDIA RTX® Virtual Workstation (vWS), NVIDIA Virtual Compute Server (vCS) |
vGPU Profiles Supported | See vGPU Pricing & Licensing Guide |
NVIDIA® 3D Vision® and 3D Vision Pro | Support via 3 pin mini DIN |
Frame lock | Compatible (with Quadro Sync II) |
NVLink | 2-way low profile (2-slot and 3-slot bridges); connect 2x RTX A5000 |
NVLink Interconnect | 112.5 GB/s (bidirectional) |
Power Connector | 1x 8-pin PCIe |
NVENC / NVDEC | 1x / 2x (+AV1 decode) |
NVIDIA says virtualization support for the RTX A5000 will be available in an upcoming NVIDIA virtual GPU (vGPU) release.
A quick note here is that the A5000 also has display outputs and an active cooler like the A4000.
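For readers who want to compare the two workstation cards more directly, here is a minimal Python sketch that derives two figures from the A4000 and A5000 tables above: the per-pin memory data rate implied by the bus width and bandwidth, and FP32 throughput per watt of maximum power. The per-pin rates are inferred from the standard relationship bandwidth = (bus width / 8) x per-pin rate. They are not figures NVIDIA states in these tables, and the helper names are our own.

```python
# Figures copied from the A4000 and A5000 tables in this article.
cards = {
    "A4000": {"bus_bits": 256, "bandwidth_gbs": 448.0,
              "fp32_tflops": 19.2, "power_w": 140},
    "A5000": {"bus_bits": 384, "bandwidth_gbs": 768.0,
              "fp32_tflops": 27.8, "power_w": 230},
}


def implied_pin_rate_gbps(bus_bits, bandwidth_gbs):
    """Per-pin data rate (Gbps) implied by bandwidth = bus_bits / 8 * rate."""
    return bandwidth_gbs * 8 / bus_bits


def fp32_gflops_per_watt(fp32_tflops, power_w):
    """Peak FP32 GFLOPS per watt of maximum board power."""
    return fp32_tflops * 1000 / power_w


summary = {
    name: {
        "pin_rate_gbps": implied_pin_rate_gbps(c["bus_bits"], c["bandwidth_gbs"]),
        "gflops_per_watt": fp32_gflops_per_watt(c["fp32_tflops"], c["power_w"]),
    }
    for name, c in cards.items()
}

for name, row in summary.items():
    print(f"{name}: {row['pin_rate_gbps']:.1f} Gbps per pin, "
          f"{row['gflops_per_watt']:.0f} FP32 GFLOPS per watt")
```

On these numbers the A5000's bandwidth advantage comes from both a wider bus and a faster implied memory data rate (16 Gbps versus 14 Gbps per pin), while the A4000 delivers somewhat more peak FP32 per watt of rated power.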
Honorable Mentions
We also wanted to point out that NVIDIA has notebook versions of the A4000 and A5000 GPUs, along with the A3000 and T300/T1200 GPUs, launching today. We do not normally cover these, but we will simply mention them.
Final Words
This is a smart move by NVIDIA. It can release differentiated GPUs into the data center and professional markets, thereby selling GPUs at higher ASPs than on the consumer side. Given the global GPU shortage, we expect all of these models to sell well.
Update 2021-04-14: We have a post-GTC 2021 keynote video if you want to hear about some of the other announcements.
I can see it now. A headless Threadripper Pro box using GB or Asus board with seven A10 single slot GPU compute cards. Who says good things don’t come in slim packages?
No replacement for the 2-year old T4 yet?
Any word re A10 and A16 prices?
Why didn’t Nvidia announce those GPUs during GTC 2021 keynote? Also missing is 3080Ti/3070Ti news.
Looking at the A4000, a 6pin connector is a disappointment, it means it is pulling 65W from PCI-e bus, which makes multiple single-slot A4000 a no-no. Even RTX4000 had an 8pin connector.
Meanwhile the A16 is basically 4x low power A4000s on a PCI-e switch. So Nvidia is forcing multiGPU GA104 users to use A16 instead of multiple A4000s. The A16 is a monster gpu for low power inferencing.
Now that I think about it. The A16 could be 4x 128bit unannounced GA107 or it could be 4x 256bit GA104s. Hard to tell at this time. It would be cool if it were 4x GA104s, but 250W TDP tells me it is more likely 4x GA107 with 2x 2gb samsung chips per vram channel
emerth,
or for computer aided tomography 3D in a VM when your main display is attached to a custom board that keeps independent LUTs and gamma for each application as offered by Eizo, Barxo, Totoku / JVC and LG, just to list the 4800 by 3200 px 12.9MP 3:2 screens that I need to fall off the back of a truck and into my desk…
Hi Sales,
Good day.
This is Ivan from Contacthings solution, based in Penang.
Our customer is looking for the Nvidia A10 GPU. Can you quote me the reseller price, ex-works, and share the lead time?
NVIDIA A10 GPU Computing Accelerator – 24GB GDDR6 – PCIe 4.0 x16 – Passive Cooler (thinkmate.com)
Thanks,
Ivan