Since we featured these recently, we wanted to show folks a unique card. The NVIDIA A16 is NVIDIA’s latest 4x GPU card for the data center. These types of cards are generally used for VDI deployments. Since many have never seen one, we figured we would show what one looks like.
NVIDIA A16 with 4x Ampere 16GB GPUs Onboard
The NVIDIA A16 is a double-width card. From the front, it looks unremarkable, with a flat surface that helps airflow.
To give some sense of what is under the large heatsink, here is the diagram of how the four GPUs are placed.
Here is the back of the card. We like that NVIDIA has a backing here since it helps keep cards safe.
Here is the PCIe connector side. Notice the notch for the GPU retention in a PCIe x16 slot.
The card itself is mostly heatsink and cooling in the double-width form factor. The PCB with the GPUs is only a small portion of the card. One will notice that there are no display outputs on this card, which is standard in this NVIDIA series.
Here is the rear of the unit. We can see the data center 8-pin power connector and also mounting holes for GPU supports.
This has come a long way since the NVIDIA GRID M40 that we looked at almost seven years ago when we procured some cards from HPE machines that Facebook decommissioned.
The general layout has remained constant for almost a decade at this point.
NVIDIA A16 Key Specs
With that, it is time to look at the key specs for the card:
GPU architecture | NVIDIA Ampere architecture
GPU memory | 4x 16GB GDDR6
Memory bandwidth | 4x 200GB/s
Error-correcting code (ECC) | Yes
NVIDIA Ampere architecture-based CUDA Cores | 4x 1280
NVIDIA third-generation Tensor Cores | 4x 40
NVIDIA second-generation RT Cores | 4x 10
FP32 / TF32 / TF32 with sparsity (TFLOPS) | 4x 4.5 / 4x 9 / 4x 18
FP16 / FP16 with sparsity (TFLOPS) | 4x 17.9 / 4x 35.9
INT8 / INT8 with sparsity (TOPS) | 4x 35.9 / 4x 71.8
System interface | PCIe Gen4 (x16)
Max power consumption | 250W
Thermal solution | Passive
Form factor | Full height, full length (FHFL), dual slot
Power connector | 8-pin CPU
Encode/decode engines | 4 NVENC / 8 NVDEC (includes AV1 decode)
Secure and measured boot with hardware root of trust for GPU | Yes (optional)
vGPU software support | NVIDIA Virtual PC (vPC), NVIDIA Virtual Applications (vApps), NVIDIA RTX Virtual Workstation (vWS), NVIDIA AI Enterprise, NVIDIA Virtual Compute Server (vCS)
Graphics APIs | DirectX 12.0, Shader Model 5.1, OpenGL 4.6, Vulkan 1.1
Compute APIs | CUDA, DirectCompute, OpenCL™, OpenACC®
MIG support | No
In the spec chart, there are many “4x” entries and, in the case of NVDEC, an “8”. That is because this is a single physical PCIe card, but it has four Ampere generation GPUs. That makes the card difficult to explain to some folks. One way to explain it is that it is a 64GB PCIe card, but each GPU only has access to its own 16GB of memory.
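To make that arithmetic concrete, here is a minimal Python sketch that scales the per-GPU figures from the spec chart up to card totals. The dictionary keys and the `card_totals` helper are our own names for illustration; only the numbers come from the chart. Keep in mind that the memory total is a card-level sum, and each GPU still addresses only its own 16GB.

```python
# Per-GPU figures copied from the spec chart; the key names are illustrative.
PER_GPU = {
    "memory_gb": 16,
    "memory_bandwidth_gb_s": 200,
    "cuda_cores": 1280,
    "tensor_cores": 40,
    "rt_cores": 10,
}
GPUS_PER_CARD = 4  # the A16 carries four physical GPUs


def card_totals(per_gpu, gpus_per_card=GPUS_PER_CARD):
    """Multiply every per-GPU figure by the number of GPUs on the card."""
    return {name: value * gpus_per_card for name, value in per_gpu.items()}


totals = card_totals(PER_GPU)
print(totals["memory_gb"])   # 64 (GB on the card, 16GB visible to each GPU)
print(totals["cuda_cores"])  # 5120
```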
Another spec that is notable is the lack of MIG support. MIG is the technology that allows one to split a single NVIDIA A100, for example, into up to seven GPU partitions. We have shown this in a few reviews at this point. This card is almost like the anti-MIG since it has multiple physical GPUs.
NVIDIA A16 Running
Taking a look at this in the recent 16x NVIDIA GPU 128 Core Arm Server Supermicro ARS-210M-NR article, we can see the GPUs. This system has a single CPU, but it has four NVIDIA A16s installed. The A16 GPUs show as individual PCIe devices that are grouped here, each with 10 compute units and 14GB in the topology report (there is 16GB for each GPU, though).
Here is the nvidia-smi output. With only four PCIe double-width cards installed, we now get 16x Ampere generation GPUs and a total of 256GB of video memory.
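For those who want to tally GPUs and memory on their own systems, here is a rough Python sketch built around `nvidia-smi --query-gpu=index,name,memory.total --format=csv,noheader,nounits`, which reports memory in MiB. The helper names are our own, and the sample text is synthetic with nominal 16384 MiB values; it was not captured from the system in this article, and a real card may report a different usable amount.

```python
import subprocess

# nvidia-smi query that prints one CSV row per GPU: index, name, memory in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Turn 'index, name, memory.total' rows into (index, name, MiB) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        index, name, mem_mib = [field.strip() for field in line.split(",")]
        gpus.append((int(index), name, int(mem_mib)))
    return gpus


def summarize(gpus):
    """Return the GPU count and the total memory in GiB."""
    total_gib = sum(mem_mib for _, _, mem_mib in gpus) / 1024
    return len(gpus), total_gib


def query_local_gpus():
    """Run nvidia-smi on this machine (requires the NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)


# Synthetic sample: four A16 cards x four GPUs, nominal 16384 MiB each.
SAMPLE = "\n".join(f"{i}, NVIDIA A16, 16384" for i in range(16))
count, total_gib = summarize(parse_gpu_csv(SAMPLE))
print(count, total_gib)  # 16 256.0
```

On a machine with the driver installed, `summarize(query_local_gpus())` would give the same two numbers for the local hardware.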
Performance-wise, each of the four GPUs is relatively close to the NVIDIA A2 GPU. Each of the four GA107-890 Ampere GPUs has its own 128-bit wide memory subsystem with 16GB of GDDR6 at 6.25GHz, good for 200GB/s. While the A2 is designed to run at a lower TDP, it still occupies a PCIe slot. The NVIDIA A16 is designed to put more accelerators onto a single card to improve density.
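As a sanity check on the memory figures quoted above, here is a small Python sketch of the usual bus-width-times-data-rate calculation. It assumes the 6.25GHz figure corresponds to two transfers per clock, or 12.5Gbps per pin. That multiplier is our assumption rather than something stated in the spec chart, and the function name is illustrative.

```python
def gddr6_bandwidth_gb_s(bus_width_bits, memory_clock_ghz, transfers_per_clock=2):
    """Estimate peak bandwidth in GB/s.

    transfers_per_clock=2 is an assumption: it treats the quoted 6.25GHz
    as yielding 12.5Gbps per pin.
    """
    data_rate_gbps = memory_clock_ghz * transfers_per_clock  # per pin
    return bus_width_bits / 8 * data_rate_gbps               # bits -> bytes


per_gpu = gddr6_bandwidth_gb_s(128, 6.25)
print(per_gpu)      # 200.0, matching the 200GB/s per GPU in the spec chart
print(per_gpu * 4)  # 800.0 across the four GPUs on one card
```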
Final Words
The NVIDIA A16 is, in many ways, the opposite of what NVIDIA has been pushing for years. It has four physical GPUs rather than relying on MIG to parcel out larger silicon. Still, for VDI deployments and for applications that need higher-density NVENC/NVDEC options, this is a unique part.
Overall, the NVIDIA A16 is an interesting card. It reminds us of the “old days” at STH working with these cards when they were still very novel solutions. With the NVIDIA A16, this is now a product line that has lasted for almost a decade.