Aivres KR6288 NVIDIA HGX H200 Server Review


Aivres KR6288 Internal Overview CPU and NICs

The top of the server looks like a very dense 2U platform in many ways. In the photo below, we will work from right to left. The airflow guide over the CPUs is quite sturdy.

Aivres KR6288 NIC CPU Memory Area 3

A fun detail is that the airflow guide labels the CPUs and the memory slots below it.

Aivres KR6288 Airflow Guide DIMM Window Label

Removing that airflow guide, we can see the system.

Aivres KR6288 NIC CPU Memory Area 1

Between the motherboard and the front storage is a set of six fan modules. We left the cover on top of these to help ensure the platform’s structural rigidity since it is not made to sit on its side.

Aivres KR6288 Top Fan Partition

Those six fan modules cool the system’s storage, CPUs, memory, NICs, and other components.

Aivres KR6288 Intel Xeon CPU And Memory 1

In terms of processors, we have dual 4th Generation or 5th Generation Intel Xeon Scalable processors. Each CPU has eight-channel memory and has two DIMMs per channel for 16 DIMMs per CPU and 32 DDR5 DIMMs total. That is important because with over 1.1TB of HBM3e memory on the GPUs, getting a DDR5 to HBM ratio of even 2:1 requires a lot of DIMMs.
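The DIMM math above can be sketched in a few lines. This is only a rough illustration: the slot counts come from the text, the 141 GB of HBM3e per H200 is NVIDIA's published capacity, and the DIMM sizes in the loop are assumed example capacities, not a list of what Aivres qualifies for this server.

```python
# Slot counts are from the review; per-GPU HBM is NVIDIA's published H200 figure.
GPUS = 8
HBM_PER_GPU_GB = 141          # assumption: H200 SXM capacity
CPUS = 2
CHANNELS_PER_CPU = 8
DIMMS_PER_CHANNEL = 2


def dimm_slots() -> int:
    """Total DDR5 DIMM slots across both CPUs."""
    return CPUS * CHANNELS_PER_CPU * DIMMS_PER_CHANNEL


def total_hbm_gb() -> int:
    """Combined HBM3e capacity across all GPUs, in GB."""
    return GPUS * HBM_PER_GPU_GB


def ddr5_to_hbm_ratio(dimm_gb: int) -> float:
    """DDR5:HBM capacity ratio with every slot filled by dimm_gb DIMMs."""
    return dimm_slots() * dimm_gb / total_hbm_gb()


print(f"{dimm_slots()} DIMM slots, {total_hbm_gb()} GB of HBM3e")
# Hypothetical DIMM sizes, chosen only to show where the 2:1 line falls.
for dimm_gb in (32, 64, 96, 128):
    print(f"{dimm_gb:>3} GB DIMMs -> {ddr5_to_hbm_ratio(dimm_gb):.2f}:1")
```

Under these assumptions, filling all 32 slots with 64 GB DIMMs lands just short of 2:1, which is why the slot count matters.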

Aivres KR6288 Intel Xeon CPU And Memory 2

Behind the memory is a dual M.2 riser for boot SSDs, so valuable front-panel SSD slots are not used.

Aivres KR6288 Dual M.2 Boot Riser

Next to this are big power cables and plenty of MCIO cables carrying PCIe Gen5 connectivity.

Aivres KR6288 Center Power And MCIO 1

PCIe lanes are a significant challenge with 8x PCIe Gen5 x16 GPUs, 9x PCIe Gen5 x16 NICs, 8x PCIe Gen5 x4 NVMe SSDs, and more. As a result, there are PCIe switches.
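To see why the CPUs alone cannot feed everything, here is a quick lane tally. The device counts and link widths are from the paragraph above. The 80 PCIe Gen5 lanes per socket is an assumption based on Intel's published figure for 4th and 5th Gen Xeon Scalable, and the tally ignores the "and more" devices, so treat the result as a lower bound.

```python
# (count, lanes per device), taken from the review text.
DEVICES = {
    "GPUs": (8, 16),
    "NICs": (9, 16),
    "NVMe SSDs": (8, 4),
}

CPU_SOCKETS = 2
LANES_PER_CPU = 80  # assumption: PCIe Gen5 lanes per 4th/5th Gen Xeon socket


def lanes_needed(devices: dict) -> int:
    """Sum of downstream lanes if every device had a direct CPU link."""
    return sum(count * width for count, width in devices.values())


def cpu_lanes() -> int:
    """Lanes available from the CPUs under the assumption above."""
    return CPU_SOCKETS * LANES_PER_CPU


needed = lanes_needed(DEVICES)
available = cpu_lanes()
print(f"Device lanes: {needed}, CPU lanes: {available}, "
      f"shortfall: {needed - available}")
```

With these numbers, the listed devices want roughly twice the lanes the two sockets provide, which is the gap the PCIe switches fill.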

Aivres KR6288 PCIe Switch 2

Another benefit of the PCIe switches is that they provide a pathway for GPU to NIC communication without traversing the CPU fabric.

Aivres KR6288 PCIe Switch 3

The other challenge in a system this large is that it has multiple baseboards and backplanes, yet PCIe Gen5 signaling only reaches so far over PCB traces. As a result, MCIO cables are everywhere.

Aivres KR6288 MCIO

Here is the rear of the system with all of the NICs.

Aivres KR6288 NIC And Expansion Area 1

Here are the four NVIDIA ConnectX-7 NICs in the first set.

Aivres KR6288 NIC And Expansion Area 2

In the center, we have the full-height slots and one NVIDIA BlueField-3 DPU installed. Depending on the model, a BlueField-3 DPU can use up to 150W, so when we say this is a high-power and high-performance server, it is not just the GPUs.

Aivres KR6288 NIC And Expansion Area BlueField 3 DPU

Also in the center, we have the rear I/O and additional slots.

Aivres KR6288 Rear IO Board And MCIO

Here is the second set of ConnectX-7 NICs.

Aivres KR6288 NIC And Expansion Area 3

To give some perspective on the amount of networking here, we have nine 400G NICs for a total of 3.6Tbps of network bandwidth. A 32-port 100GbE switch has 3.2Tbps across all of its ports, so a single modern AI server has more network bandwidth than that entire switch.
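The comparison is simple port arithmetic, sketched here using only the figures from the text (nine 400G NICs against a 32-port 100GbE switch). It counts nominal line rate per port and ignores protocol overhead.

```python
def aggregate_gbps(ports: int, gbps_per_port: int) -> int:
    """Nominal aggregate bandwidth in Gbps for a set of identical ports."""
    return ports * gbps_per_port


server_gbps = aggregate_gbps(ports=9, gbps_per_port=400)    # nine 400G NICs
switch_gbps = aggregate_gbps(ports=32, gbps_per_port=100)   # 32-port 100GbE

print(f"Server NICs: {server_gbps / 1000:.1f} Tbps")
print(f"32x100GbE switch: {switch_gbps / 1000:.1f} Tbps")
print(f"Server exceeds switch by {server_gbps - switch_gbps} Gbps")
```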

Aivres KR6288 Rear

The hardware is cool, but we also had this in our testing colocation racks to give it a spin.

Aivres KR6288 Performance

Over the years, we have tested many AI servers. There are two major categories where the servers can gain or lose performance: cooling and power. The cooling side concerns whether the CPUs, GPUs, NICs, memory, and drives can all run at their full performance levels. The power side matters because we often see different power levels set on the NVIDIA GPUs, sometimes due to air or liquid cooling choices. We are running at the official 700W GPU spec here.

Aivres KR6288 GPU Performance

On the GPU side, NVIDIA has made it very easy to get consistent results across vendors. We were able to jump on a cloud bare metal H100 server and re-run a few tests.

Aivres KR6288 NVIDIA HGX H200 8 GPU Performance

NVIDIA claims the H200 offers up to 40-50% better performance than the H100. This is true when you need more memory bandwidth and capacity. Here is a decent range of tests and results. Of course, we did not run the H200s at 1000W, which would have had a bigger impact on the results.
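For context on where a 40-50% figure can come from, here is a rough comparison of the memory specifications. The numbers are assumptions taken from NVIDIA's published SXM specs (H100: 80 GB at 3.35 TB/s; H200: 141 GB at 4.8 TB/s), not measurements from this review, and real workloads will not scale exactly with either ratio.

```python
# Assumed published SXM memory specs; not measured on the review system.
H100 = {"hbm_gb": 80, "bandwidth_tbps": 3.35}
H200 = {"hbm_gb": 141, "bandwidth_tbps": 4.8}


def spec_ratio(new: dict, old: dict, key: str) -> float:
    """Ratio of one spec between two GPU generations."""
    return new[key] / old[key]


bw_ratio = spec_ratio(H200, H100, "bandwidth_tbps")
cap_ratio = spec_ratio(H200, H100, "hbm_gb")

print(f"Memory bandwidth: {bw_ratio:.2f}x")
print(f"Memory capacity:  {cap_ratio:.2f}x")
```

Under these assumed specs, the bandwidth ratio sits in the same range as the claimed uplift, which fits the point that the gain shows up on memory-bound workloads.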

As a preview, we have two more NVIDIA HGX H200 systems we are testing, and the systems are all within a low single-digit percentage of the performance of each other on these workloads when the GPUs are all set to 700W TDP. It is incredible how NVIDIA has made this relatively low drama.

Aivres KR6288 CPU Performance

We ran through our quick test script and compared the Xeon side to our reference 2U platform.

Aivres KR6288 Intel Xeon CPU Performance

This is more like a normal server variation, which makes sense given that the top of the server is essentially a 2U server.

Next, let us get to the power consumption.

3 COMMENTS

  1. “designed to house two processors, 32 DIMMs, nine or more 400G NICs, and eight GPUs with over 1.1GB of combined HBM3e memory.” Minor typo at the beginning, I think that you mean tb ;)

  2. Inspur was forced to exit since if they owned a company they couldn’t have a server with H200’s. Kaytus was the one who went to Singapore? Aivres was spun out as the US operations and sales as its own OEM. If they were a Chinese brand owned by Inspur they couldn’t get the H200’s for this server. I’m seeing a H200 server from them, and their business addresses are all in CA, so I don’t think it’s Inspur
