AMD EPYC 7551P Benchmarks
For this exercise, we are using our legacy Linux-Bench scripts, which help us see the cross-platform “least common denominator” results we have been using for years, as well as several results from our updated Linux-Bench2 scripts. At this point, our benchmarking sessions take days to run and we are generating well over a thousand data points. We are also running workloads for software companies that want to see how their software works on the latest hardware. As a result, this is a small sample of the data we are collecting and can share publicly. Our position is always that we are happy to provide some free data, but we also have services to let companies run their own workloads in our lab, such as with our DemoEval service. What we do provide is an extremely controlled environment where we know every step is exactly the same and each run is done in a real-world data center, not a test bench.
Python Linux 4.4.2 Kernel Compile Benchmark
This is one of the most requested benchmarks for STH over the past few years. The task is simple: we take a standard configuration file and the Linux 4.4.2 kernel from kernel.org, then make the standard auto-generated configuration utilizing every thread in the system. We are expressing results in terms of compiles per hour to make the results easier to read.
Overall, this is some great performance by the AMD EPYC 7551P. There is a clear differentiation between the three 32 core AMD EPYC 7001 series options. Intel architectures tend to perform better per core here because they have fewer NUMA nodes. At the same time, in terms of performance per dollar, the AMD EPYC 7551P is still one of the top points on this chart.
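As a rough illustration of the "compiles per hour" figure, the conversion from one timed compile can be sketched as below. This is a minimal sketch, not the Linux-Bench script itself: the function names and the `make -jN` form are assumptions made for illustration, and the 90 second timing is a placeholder rather than a measured result.

```python
import os


def compiles_per_hour(elapsed_seconds: float) -> float:
    """Convert the wall-clock time of one kernel compile into compiles per hour."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return 3600.0 / elapsed_seconds


def make_command(threads=None):
    """Build a make invocation using every hardware thread (assumed form)."""
    if threads is None:
        # os.cpu_count() can return None, so fall back to a single job.
        threads = os.cpu_count() or 1
    return ["make", f"-j{threads}"]


if __name__ == "__main__":
    # Hypothetical timing, not a measured result: one compile in 90 seconds.
    print(compiles_per_hour(90.0))  # 40.0
    print(make_command(64))         # ['make', '-j64']
```

Higher is better with this metric, which is why it reads more naturally on a chart than raw seconds per compile.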
c-ray 1.1 Performance
We have been using c-ray for our performance testing for years now. It is a ray tracing benchmark that is extremely popular to show differences in processors under multi-threaded workloads. We are going to use our new Linux-Bench2 8K render since it teases out more differences in this CPU segment than our older 4K results.
You can see the cluster of 32 core AMD EPYC results. AMD EPYC 7001 CPUs and other Zen CPUs such as consumer Ryzen parts do very well in this test. It is a primary reason why AMD uses Cinebench, a similar benchmark, heavily in their marketing.
Just for fun, here we have every single AMD EPYC 7001 series CPU in both single and dual socket configurations, showing where the AMD EPYC 7551P falls in the continuum.
The 32 core single socket part is smack in the middle of the performance spectrum for AMD EPYC. In terms of overall performance, this is a quick way to show the wide range that AMD EPYC covers at this point.
7-zip Compression Performance
7-zip is a widely used compression/decompression program that works cross-platform. We started using the program during our early days with Windows testing. It is now part of Linux-Bench.
This is one of those workloads where the AMD EPYC 7551P just wins. It is slightly slower than the EPYC 7601, but they are relatively close. If you are doing a price/performance comparison to Intel here, it would look awkward. While the AMD EPYC 7551P parts are single socket only, they are able to outpace dual socket Xeon Scalable CPUs in the same price range. The Xeon Scalable parts get more DIMM slots but can address less memory.
NAMD Performance
NAMD is a molecular modeling benchmark developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. More information on the benchmark can be found here. We are going to augment this with GROMACS in the next-generation Linux-Bench in the near future. With GROMACS, we have been working hard to support both Intel’s Skylake AVX-512 and the AVX2-supporting AMD Zen architecture. Here are the comparison results for the legacy data set:
With 32 Zen cores, we see the performance that we would expect. Having the Xeon D-2183IT here distorts the scale. It is Intel’s $1800 offering for the embedded market and is single socket only. The AMD EPYC 7551P does not enjoy an enormous price premium over the Xeon D-2183IT, but it enjoys a large performance advantage. If AMD EPYC 3000 series platforms become more broadly available there will be even more competition throughout the server SKU stacks.
Sysbench CPU test
Sysbench is another one of those widely used Linux benchmarks. We specifically are using the CPU test, not the OLTP test that we use for some storage testing.
Taking a look at the “P” series parts, one can see some significant separation, far more than the 32 core single or dual socket configurations offer. If you are looking for a single socket 32 core AMD EPYC, this is what we recommend. The AMD EPYC 7601 is faster but costs about twice as much.
OpenSSL Performance
OpenSSL is widely used to secure communications between servers. It is an important library in many server stacks. We first look at our sign tests:
Here are the verify results:
These results show just how competitive the part is. Here the AMD EPYC 7551P at about $2300 is handily faster than the Intel Xeon Gold 6136 and Gold 6138 CPUs that are around the same price. It is also faster than the Intel Xeon Gold 6152 which costs about $1000 more.
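The price comparison above amounts to a performance-per-dollar ranking, which can be sketched as follows. The prices come from the text (about $2300 for the EPYC 7551P, and about $1000 more for the Gold 6152); the scores are hypothetical placeholders, not our measured OpenSSL results, and the function names are assumptions made for illustration.

```python
def performance_per_dollar(score: float, price_usd: float) -> float:
    """Return benchmark score per dollar of CPU list price."""
    if price_usd <= 0:
        raise ValueError("price_usd must be positive")
    return score / price_usd


def rank_by_value(parts: dict) -> list:
    """Sort part names from best to worst score per dollar.

    `parts` maps a part name to a (score, price_usd) tuple.
    """
    return sorted(
        parts,
        key=lambda name: performance_per_dollar(*parts[name]),
        reverse=True,
    )


if __name__ == "__main__":
    # Scores are placeholders for illustration only; prices are approximate.
    example = {
        "EPYC 7551P": (1000.0, 2300.0),
        "Xeon Gold 6152": (900.0, 3300.0),
    }
    for name in rank_by_value(example):
        score, price = example[name]
        print(name, round(performance_per_dollar(score, price), 4))
```

A part that is both faster and cheaper will always rank first here; the metric is more interesting when a pricier part is also the faster one.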
UnixBench Dhrystone 2 and Whetstone Benchmarks
Some of the longest-running tests at STH are the venerable UnixBench 5.1.3 Dhrystone 2 and Whetstone results. They are certainly aging; however, we constantly get requests for them, and many angry notes when we leave them out. UnixBench is widely used, so we are including it in this data set. Here are the Dhrystone 2 results:
And the Whetstone results:
It is a bit hard to see on these charts, but the AMD EPYC 7551P gives up some single thread performance for excellent raw throughput across a larger number of cores.
GROMACS STH Small AVX2/AVX-512 Enabled
We have a small GROMACS molecule simulation we previewed in the first AMD EPYC 7601 Linux benchmarks piece. In Linux-Bench2 we are using a “small” test for single and dual socket capable machines. Our medium test is more appropriate for higher-end dual and quad socket machines. Our GROMACS test will use the AVX-512 and AVX2 extensions if available.
AVX-512 helps Intel a lot here. With the single FMA AVX-512 Intel Xeon Gold 5100 and Silver 4100 parts in the price range of the AMD EPYC 7551P, Intel falls behind. Still, with more cores, the AMD EPYC 7551P AVX2 units are able to almost match pace with the Intel Xeon Gold 6138.
Chess Benchmarking
Chess is an interesting use case since it has almost unlimited complexity. Over the years, we have received a number of requests to bring back chess benchmarking. We have been profiling systems and are ready to start sharing results:
Again, there are many tasks that having more cores simply wins at. We added the dual Intel Xeon E5-2697 V3 results in here. Those are two generations old now but were $2700 each. For $2300, AMD is now offering just under what Intel offered for $5400 in 2014-2015. Although there are some workloads where having four NUMA nodes is not helpful, in many workloads AMD’s storyline of offering a single socket solution to match or exceed Intel’s dual socket performance holds. This is certainly true in the Gold 5100 series range and below.
Next, we are going to look at the AMD EPYC 7551P power consumption, then discuss where it fits into today’s crowded CPU market. We are going to end with our final thoughts.
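The cost arithmetic behind the $2300 versus $5400 comparison can be written out as a short sketch. The prices are the approximate figures quoted above; the function name and the returned field names are assumptions made for illustration.

```python
def cpu_cost_comparison(single_price_usd: float, dual_price_each_usd: float) -> dict:
    """Compare one single socket CPU against a pair of dual socket CPUs."""
    dual_total = 2 * dual_price_each_usd
    return {
        "dual_total_usd": dual_total,
        "savings_usd": dual_total - single_price_usd,
        "cost_fraction": single_price_usd / dual_total,
    }


if __name__ == "__main__":
    # One EPYC 7551P (~$2300) versus two Xeon E5-2697 V3 (~$2700 each).
    result = cpu_cost_comparison(2300, 2700)
    print(result["dual_total_usd"])           # 5400
    print(result["savings_usd"])              # 3100
    print(round(result["cost_fraction"], 3))  # 0.426
```

This covers CPU list price only. It does not account for the second socket's platform, power, or licensing costs.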
Holy hell at that c ray chart with all the EPYCs. That’s EPYC. Good job STH.
What you talked about some, but I think you can talk about more is that the $1100 cost to upgrade over the 7401P is worth it if you’ve got lots of drives and RAM in a virtualization server
Looks like a scam. Buy Intel.
“The only real competition we see on the AMD EPYC side is with the AMD EPYC 7401P.”
I think the competition are actually dual-socket EPYC 7281/7301.
Thanh – That is a good point on the dual EPYC 16 core competition. Moving to dual socket has some benefits, but costs more operationally.
I wonder why Dell enumerates CPUs in this strange way? To me it looks like they like to do SMP system from NUMA hence enforcing wrong mapping (seen that on my old HP 585 first gen. Option to enforce SMP behaviour although machine was 4 nodes numa). But w/o testing and looking into BIOS options this is hard to claim of course…
Still hoping mining/cryptonight benchmarking becomes standard test
Almost any way you look at EPYC, AMD is to be complimented for an outstanding engineering achievement!
Any thoughts on why AMD hasn’t picked up more market shares in DC? I am guessing because the lead time is way longer?
Or are they all waiting for Zen2?