NVIDIA A16 with 4x Ampere 16GB GPUs Onboard Quick Look

NVIDIA A16 4x 16GB GPU Name 2

Since we recently featured NVIDIA A16 cards in a server review, we wanted to show folks this unique card on its own. The NVIDIA A16 is NVIDIA’s latest 4x GPU card for the data center. These types of cards are generally used for VDI deployments. Since many have never seen one, we figured we would show what one looks like.

NVIDIA A16 with 4x Ampere 16GB GPUs Onboard

The NVIDIA A16 is a double-width card. From the front, it looks unremarkable: a flat surface that helps airflow.

NVIDIA A16 4x 16GB GPU Front

For some sense of what is under the large heatsink at this point, here is the diagram of how the four GPUs are placed.

NVIDIA A16 Four GPU Diagram

Here is the back of the card. We like that NVIDIA has a backing here since it helps keep cards safe.

NVIDIA A16 4x 16GB GPU Back

Here is the PCIe connector side. Notice the notch used for GPU retention in a PCIe x16 slot.

NVIDIA A16 4x 16GB GPU Bottom PCIe Connector

The card itself is mostly heatsink and cooling in the double-width form factor. The PCB with the GPUs is only a small portion of the card. There are no display outputs on this card, which is standard in this NVIDIA series.

NVIDIA A16 4x 16GB GPU IO Plate Side

Here is the rear of the unit. We can see the data center 8-pin power connector and also mounting holes for GPU supports.

NVIDIA A16 4x 16GB GPU Rear With Power Input

This has come a long way since the NVIDIA GRID M40 that we looked at almost seven years ago when we procured some cards from HPE machines that Facebook decommissioned.

NVIDIA GRID M40 front and back GPUs highlighted

The general layout has remained constant for almost a decade at this point.

NVIDIA A16 Key Specs

With that, it is time to look at the key specs for the card:

GPU architecture: NVIDIA Ampere architecture
GPU memory: 4x 16GB GDDR6
Memory bandwidth: 4x 200GB/s
Error-correcting code (ECC): Yes
NVIDIA Ampere architecture-based CUDA Cores: 4x 1280
NVIDIA third-generation Tensor Cores: 4x 40
NVIDIA second-generation RT Cores: 4x 10
FP32 | TF32 | TF32 (TFLOPS): 4x 4.5 | 4x 9 | 4x 18
FP16 | FP16 (TFLOPS): 4x 17.9 | 4x 35.9
INT8 | INT8 (TOPS): 4x 35.9 | 4x 71.8
System interface: PCIe Gen4 (x16)
Max power consumption: 250W
Thermal solution: Passive
Form factor: Full height, full length (FHFL) dual slot
Power connector: 8-pin CPU
Encode/decode engines: 4 NVENC / 8 NVDEC (includes AV1 decode)
Secure and measured boot with hardware root of trust for GPU: Yes (optional)
vGPU software support: NVIDIA Virtual PC (vPC), NVIDIA Virtual Applications (vApps), NVIDIA RTX Virtual Workstation (vWS), NVIDIA AI Enterprise, NVIDIA Virtual Compute Server (vCS)
Graphics APIs: DirectX 12.0, Shader Model 5.1, OpenGL 4.6, Vulkan 1.1
Compute APIs: CUDA, DirectCompute, OpenCL, OpenACC®
MIG support: No

In the spec chart, there are many “4x” entries and, in the case of NVDEC, an “8”. That is because this is a single physical PCIe card, but it has four Ampere generation GPUs onboard. That makes the card difficult to explain to some folks. One way to explain it is that it is a 64GB PCIe card, but each GPU only has access to its own 16GB of memory.
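To make the “4x” arithmetic concrete, here is a minimal Python sketch that multiplies the per-GPU figures from the spec chart by the four GPUs on the card. The names `PerGpuSpec`, `A16_GPU`, and `card_totals` are ours, made up for illustration, and are not part of any NVIDIA tooling.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class PerGpuSpec:
    """Per-GPU figures taken from the spec chart (hypothetical container)."""
    memory_gb: int
    memory_bandwidth_gbs: int
    cuda_cores: int
    tensor_cores: int
    rt_cores: int


# Values for one of the four GPUs on an A16, as listed in the chart.
A16_GPU = PerGpuSpec(
    memory_gb=16,
    memory_bandwidth_gbs=200,
    cuda_cores=1280,
    tensor_cores=40,
    rt_cores=10,
)
GPUS_PER_CARD = 4


def card_totals(gpu: PerGpuSpec, gpus_per_card: int = GPUS_PER_CARD) -> dict:
    """Multiply every per-GPU figure by the number of GPUs on the card."""
    return {name: value * gpus_per_card for name, value in asdict(gpu).items()}


totals = card_totals(A16_GPU)
# The 64GB total is a marketing-style sum, not a shared pool:
# each GPU can still only address its own 16GB.
print(totals)
```

The totals are only sums. Nothing here implies that a single workload can use all 64GB or all four GPUs' cores as one device.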

Another spec that is notable is the lack of MIG support. MIG is the technology that allows one to split a single NVIDIA A100, for example, into up to seven GPU partitions. We have shown this in a few reviews at this point. This card is almost like the anti-MIG since it has multiple physical GPUs.

NVIDIA A16 Running

Taking a look at this in the recent 16x NVIDIA GPU 128 Core Arm Server Supermicro ARS-210M-NR article, we can see the GPUs. This system has a single CPU, but it has four NVIDIA A16s installed. The GPUs on each A16 show up as individual PCIe devices that are grouped here, each with 10 compute units and 14GB in the topology report (there is 16GB for each GPU, though).

Supermicro ARS 210M NR With 16x NVIDIA A16 GPUs And 128 Arm Cores Lstopo

Here is the nvidia-smi output. With only four PCIe double-width cards installed, we now get 16x Ampere generation GPUs and a total of 256GB of video memory.

Supermicro ARS 210M NR With 16x NVIDIA A16 GPUs And 128 Arm Cores
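For those who want to count GPUs and total video memory on their own system, here is a rough Python sketch built on `nvidia-smi`'s CSV query mode. It assumes `nvidia-smi` is on the PATH and that `memory.total` is reported in MiB. The helper names (`parse_gpu_rows`, `summarize`, `query_gpus`) are hypothetical and this is not how we gathered the output shown in the article.

```python
import subprocess

# nvidia-smi's CSV query mode; with "nounits", memory.total is a bare MiB number.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_rows(csv_text: str) -> list:
    """Turn 'index, name, memory' lines into a list of dicts."""
    gpus = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        index, rest = line.split(",", 1)
        name, memory_mib = rest.rsplit(",", 1)
        gpus.append({
            "index": int(index),
            "name": name.strip(),
            "memory_mib": int(memory_mib),
        })
    return gpus


def summarize(gpus: list) -> tuple:
    """Return (GPU count, total memory in GiB)."""
    total_mib = sum(gpu["memory_mib"] for gpu in gpus)
    return len(gpus), total_mib / 1024


def query_gpus() -> list:
    """Run nvidia-smi and parse its output (requires an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_rows(result.stdout)
```

Calling `summarize(query_gpus())` on a system with four A16 cards should list 16 GPUs. The reported memory total may come in below the nominal 256GB, depending on what the driver reserves.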

Performance-wise, each of the four GPUs is relatively close to the NVIDIA A2 GPU. Each of the four GA107-890 Ampere GPUs has its own memory subsystem with 16GB of GDDR6 on a 128-bit wide interface at 6.25GHz, for 200GB/s. While the A2 is designed to run at a lower TDP, it still occupies a PCIe slot. The NVIDIA A16 is designed to put more accelerators onto a single card to improve density.
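As a quick sanity check on the memory figures, the sketch below works out bandwidth from bus width and clock. It assumes the 6.25GHz figure is a memory clock that moves two transfers per cycle (12.5Gbps per pin), which is our assumption rather than something stated in the spec chart, though it does land on the listed 200GB/s. The function name `gddr_bandwidth_gbs` is made up for this example.

```python
def gddr_bandwidth_gbs(bus_width_bits: int, memory_clock_ghz: float,
                       transfers_per_clock: int = 2) -> float:
    """Peak bandwidth in GB/s = bytes per transfer * transfers per second."""
    bytes_per_transfer = bus_width_bits / 8  # 128-bit bus -> 16 bytes
    return bytes_per_transfer * memory_clock_ghz * transfers_per_clock


# Assumed inputs: 128-bit bus, 6.25GHz clock, double data rate.
per_gpu = gddr_bandwidth_gbs(128, 6.25)
per_card = 4 * per_gpu  # four independent memory subsystems on one card

print(f"per GPU: {per_gpu:.0f} GB/s, per card (summed): {per_card:.0f} GB/s")
```

As with memory capacity, the per-card number is a sum across four separate GPUs, not bandwidth any single GPU can use.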

Final Words

The NVIDIA A16 is, in many ways, the opposite of what NVIDIA has been pushing for years. It has four physical GPUs rather than relying on MIG to parcel out larger silicon. Still, for VDI deployments and for applications that need higher-density NVENC/NVDEC options, this is a unique part.

NVIDIA A16 4x 16GB GPU Name 2

Overall, the NVIDIA A16 is an interesting card. It reminds us of the “old days” at STH working with these cards when they were still very novel solutions. Now, with the NVIDIA A16, this is a product line that has lasted for almost a decade.

8x NVIDIA GRID M40 Installed
