Today we have a massive server from Supermicro to run through our tests in the lab: the SuperServer 8048B-TR4FT. This is a 4-socket Intel Xeon E7 V3 platform that can handle up to 6TB of DDR4 using 96x 64GB sticks. Intel has been making consistent headway in the market for 4-processor (4P) and larger servers for some time. Unlike the Atom, Xeon E3, Xeon D and Xeon E5 lines, Intel Xeon E7 series machines are produced by a relatively limited number of Intel’s partners, and Supermicro is one of them. That fact, combined with the sheer cost of these systems, means that reviews of them rarely cross the wire. Today we will provide an overview of some of the Supermicro/Intel Xeon E7 V3 big iron and give a glimpse of the differences between this and other machines. You can read more about the Xeon E7 V3 in our launch overview.
The Intel Xeon E7 V3 Target Market
As a frame of reference, the Intel Xeon E7 series provides additional RAS features (for reliability) and scale up features (in terms of memory and sockets) to meet some of the more demanding server workloads. Systems like the 8048B-TR4FT are targeted at a few different markets including Traditional Business Processing, Business Intelligence/ Analytics, and Virtualization.
Traditional Business Processing
High memory capacity, advanced reliability and top of the line performance to support the larger amount and criticality of data for Relational DBMS and Enterprise Application Server workloads such as ERP, CRM, HRM and other Mission Critical applications. Oracle and SAP ERP applications are good use cases for the Xeon E7 series.
Business Intelligence / Analytics
High availability, advanced reliability, higher number of cores / threads and increased memory capacity to support In-Memory databases and Big Data Analytics and other Advanced Analytics Solutions. SAP HANA is a good example of a BI/ Analytics application for the Xeon E7 series.
Virtualization
Advanced reliability combined with greater performance and headroom for high VM density and high-capability virtualization, with support for VMs that demand a large memory footprint.
Supermicro SuperServer 8048B-TR4FT Overview
The SuperServer 8048B-TR4FT is a high-end server composed of four main subsystems: the SC848XTS-R3240BP 4U server chassis, the X10QBi quad processor motherboard, eight X10QBi-MEM1 memory module cards and one AOM-X10QBi-A I/O card. If you were to walk by this server in a datacenter rack, it would look very similar to a standard Supermicro 3.5″ 24-bay 4U chassis. That understated exterior hides the monster that lurks beneath the steel skin.
SuperServer 8048B-TR4FT Key Features
- Quad socket R1 (LGA 2011) supports Intel Xeon processor E7-8800 v3 / E7-4800 v3 family (18-Core), w/ QPI up to 9.6 GT/s
- Up to 6TB DDR4 1866MHz ECC RDIMM/LRDIMMs; 96x DIMM slots (8x memory module boards: X10QBi-MEM2)
- Up to 11x PCIe 3.0 slots (4 x16, 7 x8)
- I/O ports via Add-on Card: 2x 10Gb LAN, 1x Dedicated LAN for IPMI Remote Management, 1x VGA
- Up to 24x 3.5″ Hot-swap SAS2 HDD, 4x 2.5″ Fixed internal SATA drive bays
- 4x 92mm Hot-swap cooling fans, 3x 80mm Hot-swap rear exhaust fans, 2x 80mm rear exhaust fans (option)
- 1620W Redundant (N+1) Power Supplies, Platinum Level (94%)
The Intel C602J PCH platform supports the following features
- Processor/Cache: CPU – Intel Xeon processor E7-8800 v3 / E7-4800 v3 family (up to 18 Cores and 165W TDP) Quad Socket R1 (LGA 2011)
- Memory Capacity: 96x 288-pin DDR4 DIMM slots
- Supports up to 6TB DDR4 ECC RDIMM/LRDIMM in 96 DIMM slots
- Memory Type 1866/1600/1333/1066/800MHz ECC DDR4 SDRAM 72-bit, 288-pin gold-plated DIMMs
- DIMM Sizes 64GB, 32GB, 16GB, 8GB, 4GB
On-Board Devices
- Chipset Intel C602J chipset
- AHCI SATA SATA3 (6Gbps) with RAID 0, 1
- SATA2 (3Gbps) with RAID 0, 1, 5, 10
Network Controllers
- (via AOM) Intel X540 Dual Port 10GBase-T
- Supports 10GBase-T, 100BASE-TX, and 1000BASE-T, RJ45 output
- 1x Realtek RTL8201N PHY (dedicated IPMI)
IPMI
- (via AOM) Support for Intelligent Platform Management Interface v.2.0
- IPMI 2.0 with virtual media over LAN and KVM-over-LAN support
- ASPEED 2400
- Graphics ASPEED 2400
Input / Output
- AHCI SATA 2x SATA3 (6Gbps) ports, 4x SATA2 (3Gbps) ports
- LAN 2x RJ45 10GBase-T ports on AOM
- 1x dedicated LAN port for IPMI on AOM
- USB 2x USB 2.0 ports
Expansion Slots
- 11x PCIe 3.0 slots w/ 48 memory slots
- (4x PCIe 3.0 x16 and 7x PCI-E 3.0 x8)
or
- 8x PCIe 3.0 slots w/ 96 memory slots
- (4x PCIe 3.0 x16 and 4x PCI-E 3.0 x8)
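As a rough cross-check of the two slot layouts listed above, the short sketch below totals the slots and lanes in each configuration and estimates the aggregate slot bandwidth. The per-lane figure comes from the PCIe 3.0 signaling rate (8 GT/s with 128b/130b encoding); the names `LAYOUTS` and `summarize` are ours for illustration, and the result is a theoretical ceiling, not a measurement from this system.

```python
# Slot layouts as listed above: (link width, number of slots).
LAYOUTS = {
    "48 memory slots": [(16, 4), (8, 7)],
    "96 memory slots": [(16, 4), (8, 4)],
}

# PCIe 3.0 signals at 8 GT/s per lane with 128b/130b encoding.
GBPS_PER_LANE = 8 * (128 / 130) / 8  # usable GB/s per lane, per direction


def summarize(layout):
    """Return (slot count, total lanes, aggregate GB/s per direction)."""
    slots = sum(count for _, count in layout)
    lanes = sum(width * count for width, count in layout)
    return slots, lanes, lanes * GBPS_PER_LANE


for name, layout in LAYOUTS.items():
    slots, lanes, gbps = summarize(layout)
    print(f"{name}: {slots} slots, {lanes} lanes, ~{gbps:.0f} GB/s per direction")
```

The trade-off is visible in the lane totals: moving from 48 to 96 memory slots gives up three x8 slots.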
Getting started with the Supermicro SuperServer 8048B-TR4FT
The SuperServer 8048B-TR4FT is rather large at 7” (4U) x 17.2” x 32.1” and weighs in at 74.5 lbs; the entire shipping package comes in at 115 lbs. The shipping container is made from heavy duty cardboard with a forklift compatible pallet base. Ample foam inserts are used to protect the server from punctures and rough handling. Tool-less server rails are included, along with an accessory box containing the needed drive tray screws and power cords.
Here we see the 8048B-TR4FT out of the shipping box with the top lid removed. The 24 drive bays dominate the front. Notice that there are no extra cooling vents, so air must flow through the drive bays. We will see later how this impacts cooling.
On Supermicro’s website it clearly states, “Due to the complexity of integration, this product is sold as completely assembled systems only (including but not limited to CPUs, memory, and HDDs). Please contact your Supermicro sales rep for special requirements.”
Our system came fully certified by Supermicro and that is how we ran all our tests. Supermicro started as a motherboard manufacturer years ago, which made it popular with VARs and with Silicon Valley startups looking for an ODM. Today, the vast majority of Supermicro’s revenues come from complete system sales.
The SuperServer 8048B-TR4FT’s Motherboard
The heart of the 8048B-TR4FT server is the X10QBi motherboard. If you wanted to see a massive PCB, this is it.
The X10QBi motherboard takes up most of the bottom of the server and measures 19” x 17”. Below center we find the four Socket R1 (LGA 2011) sockets that the Intel Xeon E7s fit into. Down the center and on each side are slots for the memory cards, with PCIe slots on each side. Power connectors line the front of the motherboard in close proximity to the massive power circuits that feed the CPUs. Intel rates some Xeon E7 V3 processors at a 165W TDP, giving them the thermal headroom to run many cores using AVX and/or Turbo Boost.
Here we see how the PCIe slots are allocated to each CPU. The PCIe 3.0 x16 slots are found on either side of the memory slots, which allows one to configure the system to accept either more memory or more PCIe expansion cards. CPU #1 on the far left has only two PCIe slots allocated to it; the third slot near the edge of the board is for the AOM-X10QBi-A I/O module, which provides I/O features such as IPMI and dual 10GBase-T LAN (Intel X540) for the server.
Built upon the functionality and capability of the Intel E7 series processor(s) and the 602J PCH, the X10QBi system provides support for quad-processor-based HPC/ Cluster/ Database server platforms. Here is the corresponding block diagram for the server.
With the Intel QuickPath interconnect (QPI) controller built in, the E7 series processor offers a point-to-point system interconnect interface, enhancing system performance by utilizing serial link interconnections, which allows for increased bandwidth and scalability.
The 602J PCH provides an interface between the QPI-based processor and PCI-Express components. Each processor supports full-width, bidirectional interconnects at speeds of up to 9.6 GT/s. Each QPI link consists of 20 pairs of unidirectional differential lanes for data transmission in addition to a differential forwarding clock. The x16 PCI Express Gen 3 connections can also be configured as x8, x4, and x2 links to comply with the PCIe Base Specification. These PCIe Gen 3 lanes support peer-to-peer read-and-write transactions.
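To put the 9.6 GT/s figure into more familiar units, here is a minimal sketch of the per-link QPI bandwidth arithmetic. It assumes that 16 of the 20 lanes in each direction carry data payload (the remainder being protocol and CRC overhead), which is how QPI bandwidth is normally quoted; the constant and function names are ours for illustration.

```python
QPI_GT_PER_S = 9.6        # transfer rate quoted for this platform
LANES_PER_DIRECTION = 20  # lanes per direction, as described above
DATA_LANES = 16           # assumption: 16 of the 20 lanes carry payload


def qpi_link_bandwidth_gbs(gt_per_s=QPI_GT_PER_S, data_lanes=DATA_LANES):
    """Theoretical payload bandwidth of one QPI link, per direction, in GB/s."""
    bytes_per_transfer = data_lanes / 8  # 16 bits -> 2 bytes per transfer
    return gt_per_s * bytes_per_transfer


per_direction = qpi_link_bandwidth_gbs()
print(f"Per direction:  {per_direction:.1f} GB/s")
print(f"Bidirectional:  {2 * per_direction:.1f} GB/s")
```

These are theoretical link ceilings; traffic between sockets in a real workload will see less.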
The 602J PCH also offers a wide range of ESI, Intel I/OAT Gen 3, Intel VT-d, and RAS (Reliability, Availability and Serviceability) support. The features supported include memory interface ECC, x4/x8 Single Device Data Correction (SDDC), Flow-through CRC (Cyclic Redundancy Check), parity protection, out-of-band register access via the SMBus, and memory mirroring for data integrity.
Installing CPUs/RAM
Unlike with hard drives, when CPUs or RAM need to be installed, machines are often taken offline. Although Supermicro offers onsite support options, we decided to test serviceability by installing the CPUs and RAM ourselves. Installing or removing CPUs in the 8048B-TR4FT can be a bit tricky. We can see there is minimal space inside to flip up the retention levers and get your hand down inside to install the CPUs. The memory card support frame also extends down into this area, which makes hand clearance even tighter.
We did manage to get all four CPUs installed without issue by taking our time and being very careful. A small flashlight is useful for lining up the orientation slots on the CPU and socket.
After installing the CPUs, we needed to install the heat sinks. Although this was easier than installing the CPUs, since you do not need to worry about damaging the pins in the socket, we needed to line up the mounting screws with the socket mounts without actually being able to see them. We found it best to screw in one screw only slightly, then move to the opposite corner and lightly screw that one down. This gives the heat sink solid, even contact; you can then work around the screws, advancing each a little at a time, until you have a nice tight fit.
To finish up installing the CPUs and heat sinks, there is a plastic air shroud that covers the heat sink area.
There are a total of 8x X10QBi memory cards in this system, and each memory card has 12 slots for a grand total of 96 slots. At the very minimum, a total of 16x memory sticks need to be installed. In our testing we filled all the blue slots for a total of 32x memory sticks. We did not have a six-figure budget to spend on 96x 64GB DDR4 LRDIMMs; however, populating all memory slots would be normal practice in these systems.
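The slot and capacity figures above can be sanity-checked with a few lines of arithmetic. This is only a sketch: the function names are ours, and it assumes DIMMs are spread evenly across the eight memory cards, which is our reading of the minimum (16) and tested (32) populations rather than something Supermicro's documentation is quoted on here.

```python
CARDS = 8             # memory cards in the system
SLOTS_PER_CARD = 12   # DIMM slots on each card
LARGEST_DIMM_GB = 64  # largest supported DIMM size


def total_slots():
    """Total DIMM slots across all memory cards."""
    return CARDS * SLOTS_PER_CARD


def max_capacity_tb():
    """Maximum capacity in TB with every slot holding the largest DIMM."""
    return total_slots() * LARGEST_DIMM_GB / 1024


def dimms_per_card(total_dimms):
    """DIMMs per card, assuming an even spread across all cards."""
    per_card, remainder = divmod(total_dimms, CARDS)
    if remainder:
        raise ValueError("DIMM count does not divide evenly across cards")
    return per_card


print(f"Slots: {total_slots()}, max capacity: {max_capacity_tb():.0f} TB")
for count in (16, 32, 96):
    print(f"{count} DIMMs -> {dimms_per_card(count)} per card")
```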
The memory cards are secured by two locking latches at the top of the cards, then at the back by a mounting screw for the 6x cards at the very back. The final assembly includes cross bar supports that are screwed into place on the sides.
Here we have our memory cards installed and we see a maze of RAM sticks before us. The back of the server is dominated by the memory cards. Airflow from the midplane fans is augmented and given an additional layer of redundancy by the three hot swap fans, which draw air through the memory cards and out the rear of the chassis. Having additional redundancy is key with 165W TDP CPUs and memory banks that cost as much as a Ferrari.
Aside from cooling and PCIe expansion slots, the back of the server has four 1620W high-efficiency redundant Platinum Level (94%) power supplies. Somewhat similar to the 8x GPU Supermicro 4028GR-TR we reviewed, these power supplies are placed at the bottom of the chassis.
On the far right of the server is the AOM-X10QBi-A I/O module, which provides VGA output, Intel X540 dual port networking (10GBase-T, 100BASE-TX and 1000BASE-T) and a dedicated IPMI port.
Down at the bottom are 2x USB 2.0 and COM ports. This combination allows for a management interface very similar to what we see on other Supermicro servers. One can easily prototype on a smaller Supermicro server and use the same set of management interfaces on the Supermicro SuperServer 8048B-TR4FT in production.
BIOS
Looking over the BIOS for the 8048B-TR4FT, we find it very typical for Supermicro motherboards, so you should have no trouble getting around. One can see our 4x Intel Xeon E7-8870 V3 CPUs populated in the machine.
For Performance/Watt settings we have Traditional and Power Optimized. Power Optimized enables Intel Turbo Boost Technology only after the P0 performance state has been held for more than two seconds. This class of server usually runs at a higher load due to its resource density; however, configuration levers such as these can be useful in certain scenarios.
Supermicro IPMI interface
On the SuperServer 8048B-TR4FT we find a standard Supermicro web GUI. This is the same management interface available to those prototyping infrastructure on lower power Supermicro products so it is very easy to navigate.
Through the IPMI interface we can monitor power usage to get an overall picture of how much power we have used on the hour/day/week of server uptime.
Beyond the web interface and the KVM-over-IP features, Supermicro also has suites of tools such as IPMIview and SSM (review coming soon) to help manage large scale deployments. Because it is based on a standard IPMI 2.0 implementation, infrastructure automation tools such as MAAS for the Ubuntu OpenStack cloud platform work out of the box with the Supermicro interface. Supermicro includes these remote management features with its servers, so there is no add-on pack for iKVM functionality (unlike, e.g., HP iLO, Dell iDRAC and Intel RMM).
Test Configuration
Our test configuration was certainly nowhere close to the maximum for this platform. However, we did want to show off the capabilities of the platform and give some idea of how it performs compared to other systems we have tested.
- CPU: 4x Intel Xeon E7-8870 V3 Haswell-EX
- Motherboard: X10QBi
- Memory: Crucial 16x 16GB DDR4 non-ECC RAM and 16x 16GB DDR4 (384GB Total)
- Storage: Micron P400e 200GB SSD
- OS: Windows Server 2012 R2 and Ubuntu 14
The stats of the E7-8870 V3’s as displayed by HWiNFO.
After bringing up the server we launched the task manager to confirm we had 72 cores/144 threads available.
AIDA64 Memory Test
AIDA64 memory bandwidth benchmarks (Memory Read, Memory Write, and Memory Copy) measure the maximum achievable memory data transfer bandwidth.
Compared to the dual processor systems we have tested, the 8048B-TR4FT blows them out of the water with more than double the performance and, in the case of reads, more than triple. Here we can see one of the main reasons a client would be interested in one of these machines: with up to 6TB of DDR4, memory access is blazing fast, which is a huge boost to databases or VMs stored in RAM.
Linux-Bench Test Results
We ran the 8048B-TR4FT through our standard Linux-Bench suite using Ubuntu as our Linux distribution. Linux-Bench is highly scripted and very simple to run. It is available to anyone who wants to compare results with their own systems, the systems in our public database and reviews from other sites. See Linux-Bench.
An example of our Linux-Bench test results can be found here: SuperServer 8048B-TR4FT Linux-Bench Test Results. These can be compared to thousands of other systems. Each run has over 50 different benchmark data points for you to compare with other Linux systems. We will make more runs available for you to check out in the near future.
Cinebench R15
CINEBENCH is a real world cross platform test suite that evaluates your computer’s performance capabilities. The test scenario uses all of your system’s processing power to render a photorealistic 3D scene. This scene makes use of various different algorithms to stress all available processor cores. You can also run this test with a single core mode to give a single core rating.
Here we see our first sample screenshot of a Cinebench run with a multi-core score of 6765, which is about double that of the dual processor systems we have run before. This would have been a HWBOT World Record, but we used Windows Server 2012 R2, which is not allowed for submissions. Later, Dhenzjhen, a well-known overclocker, achieved the now current World Record with an incredible score of 7005 using Windows Server 2008 R2 on an identical system: Dhenzjhen Cinebench R15 World Record. These systems are very fast.
We also made a short video of our system running Cinebench showing the speed in which the SuperServer 8048B-TR4FT can blaze through this benchmark.
SPEC CPU2006
SPEC CPU2006v1.2 measures compute intensive performance across the system using realistic benchmarks to rate real performance.
In our testing with SPEC CPU2006 we use the basic commands to run these tests.
"runspec --tune=base --config=servethehome.cfg" followed by "int" or "fp".
To run multi-threaded, we add "--rate=144". We are reporting the median results.
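The four invocations described above (int and fp, each in single-copy and rate mode) can be generated with a small helper. This is a hedged sketch rather than our actual harness: it only builds and prints the command lines using the flags quoted in the text, it does not execute runspec, and the helper name `build_runspec_command` is hypothetical.

```python
import shlex


def build_runspec_command(benchmark_set, rate_copies=None,
                          config="servethehome.cfg", tune="base"):
    """Build a runspec argument list from the flags quoted in the text."""
    if benchmark_set not in ("int", "fp"):
        raise ValueError("benchmark_set must be 'int' or 'fp'")
    argv = ["runspec", f"--tune={tune}", f"--config={config}"]
    if rate_copies is not None:
        # Rate mode: one copy per hardware thread (144 on this system).
        argv.append(f"--rate={rate_copies}")
    argv.append(benchmark_set)
    return argv


for bench in ("int", "fp"):
    print(shlex.join(build_runspec_command(bench)))
    print(shlex.join(build_runspec_command(bench, rate_copies=144)))
```

Building the argument list separately from running it makes it easy to review the exact commands before committing a machine to a multi-day run.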
The 8048B-TR4FT is an impressive machine and generates impressive scores on our benchmarks. CPU2006 is one that pushes the system to its limits; in our test case it took the better part of six days to complete. In multi-threaded results we again see scores that are at least double those of our dual processor systems running 18-core E5-2699 V3 CPUs. The slightly lower clock speed of the E7-8870 V3 does, however, result in slightly lower single-core scores.
Temperature tests
For testing server temperatures under load we use AIDA64 Stability Test and use HWiNFO to gather data for our tests.
Here we see the graphical output from our tests on the desktop. We can monitor many different aspects of the system with HWiNFO, even down to single core/thread temps but in our case we used the processor package for our info. Graphing 144 threads would simply be far too many graph lines to be useful.
We can see that even though the 8048B-TR4FT uses only 4x midplane and 3x rear cooling fans, the system is well designed for airflow to keep these processors cool. We also measured temperatures at the exhaust fans at the back of the server during this test and saw a high of 81F/27.2C. We attribute the excellent cooling performance to the 3x rear-mounted cooling fans, which help a great deal in moving air through the server. Many servers do not have this feature, which can result in much warmer processor temperatures.
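For readers who want to convert other readings between the two scales quoted above, the conversion is a one-liner. The function name is ours; the only data point used is the 81F exhaust reading from our test.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


exhaust_f = 81  # peak exhaust temperature measured during the stress test
print(f"{exhaust_f}F = {fahrenheit_to_celsius(exhaust_f):.1f}C")
```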
Power Tests
For our power testing needs we use a Yokogawa WT310 power meter which can feed its data through a USB cable to another machine where we can capture the test results. We then use AIDA64 Stress test to load the system and measure max power loads.
With our system turned off we see about 40 watts being used to power IPMI. At power-on we see a quick jump to 550 watts, and the draw later peaks at about 850 watts. The system settles down to about 450 watts at idle.
Fully Loaded Stress Tests Power Use
For our tests we use the AIDA64 stress test, which allows us to stress all aspects of the system. We used the Traditional Performance/Watt power setting in the BIOS and we see just over 1kW at full load. This is about what we expected, as many of our dual processor platforms can reach 500 watts or more. Other factors to consider in power use are, of course, a full 6TB of DDR4 RAM, fully populated front drive bays and additional expansion cards. We think our results are a baseline for a minimally configured system.
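To translate these wattage readings into energy use over time, here is a small sketch of the arithmetic. It uses the roughly 450 watt idle figure and treats the "just over 1kW" load figure as a round 1000 watts, which is an approximation on our part; the names are illustrative, and real consumption will sit somewhere between the two depending on load.

```python
HOURS = {"hour": 1, "day": 24, "week": 24 * 7}

# Approximate draw from the tests above; 1000 W stands in for "just over 1kW".
DRAW_WATTS = {"idle": 450, "full load": 1000}


def energy_kwh(watts, hours):
    """Energy in kilowatt-hours for a constant draw over a number of hours."""
    return watts * hours / 1000


for state, watts in DRAW_WATTS.items():
    for period, hours in HOURS.items():
        print(f"{state:>9} for one {period}: {energy_kwh(watts, hours):.1f} kWh")
```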
Conclusion
The Supermicro 8048B-TR4FT is without a doubt a powerful server with the 4x Intel Xeon E7-8870 V3 Haswell-EX processors we had installed. It achieved the highest CPU benchmark scores we have seen yet in the lab.
Aside from the raw compute of four Xeon E7-8870 V3’s with their 72 cores/144 threads, another compelling reason for a machine like this is the ability to put an ocean of DDR4 memory in one box, up to 6TB. With a system like this we also saw the highest memory bandwidth yet. Combined with the Intel Xeon E7 RAS features, high memory bandwidth and capacity, we can see why these platforms are attractive for mission critical ERP applications and analytics efforts based on packages like SAP HANA. While these systems may seem pricey to server buyers accustomed to purchasing 1U/ 2U web hosting nodes, generally the software and potential downtime costs dwarf that of the server.
We like the fact that the SuperServer 8048B-TR4FT uses IPMI 2.0 compliant management and Supermicro’s standard tools shared across its range. The impact of unifying that base is that infrastructure orchestration development can happen using lower cost Supermicro products and scaled all the way up to and deployed on the SuperServer 8048B-TR4FT system. Supermicro as a company focuses on providing building blocks for its customers, and it is easy to see how even the quad Xeon E7 V3 fits into the larger portfolio.
Bill Excellent Computer, If my company gets big time I will hire you to get this going for me!
Thanks Eric
Please, big iron are large servers, weighing 1000kg with mainframe class RAS. Where you can swap everything (CPU, ram, motherboards, etc) without ever shutting the server down. Some UNIX/mainframe CPUs detect errors and back and replay instructions.. They have 16 or 32 sockets with loads of ram. For instance the Fujitsu M10 sparc UNIX server has 64 sockets and 64 TB RAM. It holds the SAP world record with 844.000 saps. The Oracle Sparc M7 CPU , which can be found in oracle large UNIX server, is the worlds fastest CPU. One M7 CPU reaches 1120 specint2006, almost as fast as these four CPUs. It is up to 11x faster than x86 on database work. Here are 30ish world records
https://blogs.oracle.com/BestPerf/entry/201510_specpu2006_t7_1
So, it looks weird when you say x86 big iron. Four socket servers are tiny and low end. UNIX and mainframe people might think it is misleading when you talk about x86 and Big Iron. The largest x86 server is the HP Kraken with 16 sockets. The SGI uv2000 is only used for cluster workloads, and can not run non clustered workloads such as SAP, there are no one running Sgi uv2000 for SAP. The HP kraken 16 sockets is mid end, with bad RAS. 4sockets are low end
Hello Bill, Thank you for an incredible review. I may have a need for a server like this if it can run a hashcat brute force session at a certain benchmark. Would it be possible for you to run a quick 5 minute test before I purchase? Thanks!
Hi Jim,
For hashcat, would you not look for a GPU-based solution rather than E7 V3? We used hashcat with 8x GPUs (GeForce GTX 1080 Ti’s)
We are going to have a follow-up, likely next week, on a 10x 1080 Ti system.
Hello Patrick, the hash I’m attempting is Scrypt, it is memory hard and very GPU unfriendly. CPU is the only solution at the moment unless I can add 320GB of ram to GPU card..
Jim – want to send me a note at patrick @ this domain with what you are looking to run? I can see if I can get it on a few machines for you.
Thank You! sent.
Patrick,
Great review, I picked up one of these boys for under 2k for a all purpose home, NAS, VM server.
Quick question though:
I want to put in one or two GPUs to get started with deep learning. The problem is the PCI-E power JPW2,3,4,5,6,7 seem to be for the CPU. Can I use a 8 pin PCI-E power splitter cable to provide power to the GPUs? Or should I get a 6 pin molex to 8 pin gpu.
Any recommendations?
Nevermind, I got access to the PSU breakout board. It was a pain.
Didn’t realize it has 8 more 8 pin PCI-E connectors on along with the 6 it uses for the motherboard. Newbie learned something :)
This motherboard https://www.supermicro.com/en/products/motherboard/X10QBi currently supports 12 TB which alone costs almost 1/4M and one of the 4048B-TRFT chassis https://www.supermicro.com/en/products/system/4U/4048/SYS-4048B-TRFT.cfm available has 48 bays. The behemoth is even bigger now.
Note about the website: I’ve been a fan of STH (dare I say an acquaintance of Patrick) for over several years and don’t enjoy solving 2 or 3 CAPTCHAs; when I always use the same IP address (changed only when my ISP decides to) same name, email, and website. Then despite that the post is held for moderation anyways.
The number being over 300GB/s is completely bogus. Local memory access even for the 4th gen Xeon Scalable architecture is around 200GB/s, with the previous generation being even slower, around 130GB/s. Each CPU could use only a single QPI for intercommunication between the CPUs, which could add extra 3x 16GB/s = 48GB/s. Crossing 300GB/s boundary even for 4th gen Xeon CPUs would be a challenge, but back in 2016 it was impossible.