There is an aesthetically displeasing trend in data centers today: servers are “humping.” Before we get too far, this is more of a camel “hump” rather than another sense of the word. The trend originated with deep learning servers, specifically those using NVIDIA GTX and Titan X series GPUs. In the last week, we have visited five data centers on two continents, and in each we have seen racks and aisles full of servers taking part in this humping trend. Even some of the servers in the STH/ DemoEval lab have started to take part.
What is Server Humping, and More Importantly, Why?
Server humping is officially a thing in the data center. The term, which we have come across often lately, describes adding a “hump” to a GPU compute server chassis to make room for the power cables of NVIDIA Titan and GTX series GPUs. Here is an example from our ASRock Rack 3U8G-C612 8-GPU server review:
Here is the same server with an NVIDIA hump turning the 3U machine into a 4U server:
The key to this trend is that most GPU servers were built for GPU compute focused cards. These cards are typically passively cooled and fit within a maximum of 3U of vertical space. Here is an example of that same server filled with NVIDIA GRID M40 cards:
And another example with AMD FirePro W9100s:
As you can see, these data center GPUs (and Intel Xeon Phi PCIe cards as well) use rear-facing power connectors.
While NVIDIA likes to tout its amazingly fast-growing data center business, there is a practice everyone seems to accept (except NVIDIA) because it is simply how things are done: using Titan/ GTX GPUs for machine learning clusters. The reason for this is simple: NVIDIA charges a 7-8x price premium for its data center parts. Although it cripples double precision and half precision performance on its consumer cards to keep that premium alive, deep learning clusters can use single precision and take advantage of consumer hardware. There is a lot of logic to this. One of NVIDIA’s major success drivers has been the fact that you can start with a desktop GTX card and get CUDA compute for a few hundred dollars. For students of deep learning, the fact that the same GPU can also play games is gravy, and that has made the NVIDIA GTX line a sales powerhouse.
As these companies put GTX and Titan cards into GPU compute servers meant for passively cooled cards, the power connectors are what cause servers to be humped. Here is an example of an NVIDIA GTX 1080 Ti 11GB GPU in a GPU compute server with power cables attached:
As you can see, the GTX series has top-facing power connectors. The NVIDIA GTX 1080 Ti is a roughly $700 GPU with a 250W power budget; since a PCIe slot can supply only 75W on its own, the card needs an extra 8-pin and 6-pin power connector to make up the difference. When installed in a GPU compute server originally designed for data center cards in a 3U chassis, this presents an almost laughable situation:
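For those who want the arithmetic behind the extra connectors, here is a minimal back-of-envelope sketch. The wattage constants are the standard PCIe power limits, not anything we measured:

```python
# Why a 250W GTX 1080 Ti needs both an 8-pin and a 6-pin connector.
# Constants are the standard PCIe power delivery limits (illustrative).
PCIE_SLOT_W = 75    # max power from the PCIe x16 slot itself
SIX_PIN_W = 75      # max power per 6-pin PCIe power connector
EIGHT_PIN_W = 150   # max power per 8-pin PCIe power connector
CARD_TDP_W = 250    # GTX 1080 Ti rated board power

available = PCIE_SLOT_W + SIX_PIN_W + EIGHT_PIN_W
print(f"Available: {available}W, needed: {CARD_TDP_W}W, headroom: {available - CARD_TDP_W}W")
# Available: 300W, needed: 250W, headroom: 50W
```

The slot plus a single 6-pin or 8-pin connector is not enough, which is why both cables end up sticking out of the top of the card.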
The power cables extend past the top of a common server chassis. I will note that shortly after taking this photo, I was teased for not using 90-degree power connectors by an individual whose company has aisles full of these Supermicro GPU compute servers (his company still uses the hump). The solution to the cable problem is adding a hump. The Supermicro SYS-4028GR-TR(2) hump is smaller than the ASRock Rack version, taking the total server from 4U to 4.5U.
We purchased ours separately after buying the chassis (part number: MCP-230-41803-0N) for about $110 each for our two servers.
You may think this “humping” trend is small, yet just this week we have seen approximately 400 servers using GTX 1080, GTX 1080 Ti, or Titan X (Pascal) GPUs. While there are some manufacturers (e.g. Tyan) with designs specifically to address this trend, most vendors are solving the GTX fitment issue with a “hump” of some sort.
In terms of market impact, even in this small sample, that is 400 “humping” servers with 8 GTX GPUs each, or 3,200 GPUs representing roughly $2M to $4M in GPU market value in just a few data centers. These GPU compute servers also drastically raise average server selling prices: the CPUs, Mellanox InfiniBand, RAM, SSDs and chassis generally total $5,000-$7,000, and the GPU complement often doubles that.
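As a rough sanity check on that range, here is a minimal sketch of the math. The per-GPU street prices are our illustrative assumptions (roughly $700 for a GTX 1080 Ti and $1,200 for a Titan X Pascal), not quotes:

```python
# Back-of-envelope GPU market value for the servers we saw this week.
servers = 400
gpus_per_server = 8
price_low = 700      # assumed GTX 1080 Ti street price (USD)
price_high = 1200    # assumed Titan X (Pascal) street price (USD)

total_gpus = servers * gpus_per_server
print(f"{total_gpus} GPUs")
print(f"${total_gpus * price_low / 1e6:.1f}M to ${total_gpus * price_high / 1e6:.1f}M")
# 3200 GPUs
# $2.2M to $3.8M, in line with the $2M to $4M estimate
```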
Final Words
If you are following the NVIDIA data center business, looking at the Tesla/ Quadro/ GRID lines alone is not sufficient. Server humps allowing GTX cards to be used in the data center (with reduced double and half precision performance) are driving significant numbers of data center GPU sales. While gaming is likely the largest driver of high-end NVIDIA GTX series sales, these data center customers are consuming 100+ GPUs at a time. Until GPU manufacturers address this trend, we are likely going to keep seeing humps. We have been told that if you order GPUs in large enough quantities, some vendors will re-solder power connectors; that should give some idea of the magnitude of these purchases.
Here’s a thought: make cards with power connectors on the bottom that can snap into the board below. Obviously that would serve something of a niche market, but it would still be a better idea than a hump.