AMD EPYC 7371 Benchmarks
For this exercise, we are using our legacy Linux-Bench scripts which help us see cross-platform “least common denominator” results we have been using for years as well as several results from our updated Linux-Bench2 scripts. At this point, our benchmarking sessions take days to run and we are generating well over a thousand data points. We are also running workloads for software companies that want to see how their software works on the latest hardware. As a result, this is a small sample of the data we are collecting and can share publicly. Our position is always that we are happy to provide some free data but we also have services to let companies run their own workloads in our lab, such as with our DemoEval service. What we do provide is an extremely controlled environment where we know every step is exactly the same and each run is done in a real-world data center, not a test bench.
Python Linux 4.4.2 Kernel Compile Benchmark
This is one of the most requested benchmarks for STH over the past few years. The task is simple: we take the Linux 4.4.2 kernel from kernel.org and a standard configuration file, then build the standard auto-generated configuration using every thread in the system. We express results in compiles per hour to make them easier to read.
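As a rough illustration of the compiles-per-hour metric, the Python sketch below times a command and converts the wall-clock time of one build into compiles per hour. This is not the Linux-Bench script itself. The function names, and the choice of `make` with one job per hardware thread as a stand-in for "every thread in the system", are assumptions made for illustration.

```python
import os
import subprocess
import time
from typing import List, Optional


def compiles_per_hour(elapsed_seconds: float) -> float:
    """Convert the wall-clock time of one compile into compiles per hour."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return 3600.0 / elapsed_seconds


def kernel_build_command() -> List[str]:
    """Build command with one make job per hardware thread (an assumption)."""
    threads = os.cpu_count() or 1
    return ["make", f"-j{threads}"]


def time_command(cmd: List[str], cwd: Optional[str] = None) -> float:
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, cwd=cwd, check=True)
    return time.perf_counter() - start
```

For example, a build that takes 90 seconds works out to `compiles_per_hour(90)`, or 40 compiles per hour. To time a real build, one would call `time_command(kernel_build_command(), cwd=...)` from an already configured kernel source tree; the path is left to the reader.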
This is an extraordinary result. Our Linux kernel compile benchmark is not bound by multi-threaded performance alone. Single-threaded performance and even memory bandwidth also affect the result.
Here we can see the AMD EPYC 7371 not only passes the Intel Xeon Gold 6130 by a significant margin, but it also enters the AMD EPYC 24-core performance levels. This is a great example of why clock speed matters.
c-ray 1.1 Performance
We have been using c-ray for our performance testing for years now. It is a ray tracing benchmark that is extremely popular to show differences in processors under multi-threaded workloads. We are going to use our 8K results which work well at this end of the performance spectrum.
Here you can see that the AMD EPYC 7371 puts a fairly significant delta between itself and the AMD EPYC 7351P based on clock speed. As a benchmark, c-ray is highly sensitive to core counts, clock speeds, and cache differences. We do not like using it across different architectures from the same vendor or from multiple vendors as it exaggerates performance gains. When AMD showed off the first EPYC “Rome” generation demo using c-ray versus Intel Xeon Scalable, this was done to show off what you are seeing above. AMD has an architectural advantage here.
7-zip Compression Performance
7-zip is a widely used compression/decompression program that works cross-platform. We started using the program during our early days with Windows testing. It is now part of Linux-Bench.
Our custom has been sorting this chart based on decompression MIPS. In this generation, that favors the AMD EPYC over Intel Xeon Scalable results so we ask our readers to take a critical look at this one.
One point that the chart draws out is that this is a high-speed 16-core-per-socket part. The Intel Xeon Gold 6134 and Gold 6144 (we do not have the 6144) are very fast chips. If you had a 16-core-per-machine limit, then these would certainly be in the running, as you can see from the dual Xeon Gold 6134 results we brought in above. The Intel Xeon Gold 6130 cannot keep up with the AMD EPYC 7371 here, which suggests that Intel’s strongest competition to the AMD EPYC 7371 may actually be a dual-socket server with 8-core CPUs.
Intellectually, that statement should give you pause. Is a dual Intel Xeon 8-core server the same as a single-socket AMD EPYC 7371 server? Extrapolating, is a quad Intel Xeon 8-core server equivalent to a dual AMD EPYC 7371 server? These are questions that the AMD EPYC 7371 raises. We did not have to ask them when we could look to Intel’s frequency optimized parts and AMD’s prior lack thereof and dismiss the notion out of hand. AMD is forcing enterprises to start asking these questions by delivering a solid frequency optimized CPU.
NAMD Performance
NAMD is a molecular modeling benchmark developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. More information on the benchmark can be found here. We are going to augment this with GROMACS in the next-generation Linux-Bench in the near future. With GROMACS, we have been working hard to support both Intel’s Skylake AVX-512 and the AVX2-capable AMD Zen architecture. Here are the comparison results for the legacy data set:
As we are going to see in our AVX2/AVX-512 GROMACS results, instruction set support has a big impact on molecular simulation. At the same time, when we look at raw compute performance, the AMD EPYC 7371 can open up a big gap over the Intel Xeon Gold 6130 due to its higher base clock frequency. The dual Intel Xeon Gold 6134 solution is very competitive, but again that compares dual-socket frequency optimized Intel to single-socket frequency optimized AMD.
Sysbench CPU test
Sysbench is another one of those widely used Linux benchmarks. We specifically are using the CPU test, not the OLTP test that we use for some storage testing. Here we are going to use the single threaded test.
One way to look at this chart is that AMD is dramatically better than Intel. That is not a great way to interpret it. This is one of those cases where AMD has an architectural advantage that skews results. Instead, we wanted to focus on clock speed for a second. One can see that the AMD EPYC 7371 provides a significant jump in single-thread clock speed over previous AMD EPYC chips. At the same time, one can see that the Intel Xeon Gold 6100 series generally tops out around 3.7GHz, save for a few SKUs, while the Intel Xeon Gold 5100 and Xeon Silver lines have lower single-thread performance.
The trick to the AMD EPYC 7371 performance is not just high single-core speeds. Instead, it is maintaining higher clocks throughout the range of cores and threads used.
OpenSSL Performance
OpenSSL is widely used to secure communications between servers. This is an important protocol in many server stacks. We first look at our sign tests:
Here are the verify results:
When it comes to 16 core CPUs, the AMD EPYC 7371 is the top of our list. We do not have the Intel Xeon Gold 6142 which has a 500MHz higher base clock but the same maximum turbo clock speed as the Gold 6130. With that, we think Intel would squeeze out a victory over the AMD EPYC 7371 on our OpenSSL benchmarks.
UnixBench Dhrystone 2 and Whetstone Benchmarks
Some of the longest-running tests at STH are the venerable UnixBench 5.1.3 Dhrystone 2 and Whetstone results. They are certainly aging. However, we constantly get requests for them, and many angry notes when we leave them out. UnixBench is widely used so we are including it in this data set. Here are the Dhrystone 2 results:
Here are the Whetstone numbers:
Here the AMD EPYC 7371 is very competitive with dual Intel Xeon Gold 6134 processors. In single-threaded Dhrystone 2, the figures are within the margin we use to describe a dead heat due to testing variations.
We wanted to take a second for those who are wondering about the Gold 6130 versus the dual Gold 6134, and why they have the same maximum turbo clocks yet are so far apart in performance on these tests. The base clock and all-core turbo clocks on the Gold 6134 are higher. While one gets similar single-threaded performance, the dual Gold 6134 CPUs are able to maintain higher turbo clocks over more cores. That is also what makes the AMD EPYC 7371 performance so interesting, with a 700MHz clock increase over the EPYC 7351(P) CPUs.
GROMACS STH Small AVX2/AVX-512 Enabled
We have a small GROMACS molecule simulation we previewed in the first AMD EPYC 7601 Linux benchmarks piece. In Linux-Bench2 we are using a “small” test for single and dual socket capable machines. Our medium test is more appropriate for higher-end dual and quad socket machines. Our GROMACS test will use the AVX-512 and AVX2 extensions if available.
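To make "if available" concrete, here is a minimal Python sketch of how one might check which of these extensions a Linux host advertises. It reads the `flags` line of `/proc/cpuinfo` and looks for the `avx512f` and `avx2` flags. The function names are hypothetical, and this is not how GROMACS or Linux-Bench2 performs its own detection.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the CPU feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            return set(value.split())
    return set()


def widest_simd(flags: set) -> str:
    """Pick the widest of the two extensions discussed here, if either is present."""
    if "avx512f" in flags:  # AVX-512 Foundation
        return "AVX-512"
    if "avx2" in flags:
        return "AVX2"
    return "none"


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read the kernel's CPU description (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On a Linux system, `widest_simd(cpu_flags(read_cpuinfo()))` reports which path a build could target. Note that the flag only says the instructions exist; it does not say how many FMA units back them, which matters for the results that follow.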
With dual port FMA AVX-512, the Intel Xeon Gold 6130 and the rest of the Intel Xeon Gold 6100 and Platinum 8100 lines are monsters in this test. At the same time, we wanted to show that when you go below the Gold 6100 line, the AMD EPYC 7371 with AVX2 is quite competitive. Intel defeatures its mainstream Gold 5100 and Silver 4100 parts both in terms of clock speed and AVX-512 performance. The AMD EPYC 7371 will not see competition below the Intel Xeon Gold 6100 line in this generation.
Chess Benchmarking
Chess is an interesting use case since it has almost unlimited complexity. Over the years, we have received a number of requests to bring back chess benchmarking. We have been profiling systems and are ready to start sharing results:
Here one can again see that the step function that formerly separated AMD’s 16-core and 24-core CPUs has been closed. The AMD EPYC 7371 bridges this gap and puts the Intel Xeon Gold 6130, one of Intel’s competitive higher clocked 16-core offerings, well behind.
Since this is a more significant piece than most, we wanted to give a little more color aside from our standard tests so we are going to have some bonus workloads before moving into our tests.
On page 4 how do I know which terminal is for which processor?
I started reading and I was like whoa baby this is a 5 page CPU review why? Then I got into it at lunch and I know why. You’re right on the impact. That per-core performance is why we haven’t moved our Windows Hyper-V cluster to EPYC or even started to test it there.
You didn’t really mention it, but the clock speed also helps licenses in VMs. Maybe that’s obvious, but maybe to some it isn’t.
For SQL server you’re right that there’s other chips that might be better but those are extremely targeted products. It’s like EPYC’s first parts covered 75% of the market. These get them 20% more. Then there’s 5% that Intel still has better parts for.
Randy Bostrom if you look at the prompt text in those screenshots you’ll kick yourself. It’s a long review. It’s taken me 40 minutes to read.
CAN SOMEONE GET THESE GUYS GOLD 6142 CHIPS PLEASE
It can’t be that hard! Make it happen. @Intel if you don’t we’re gonna say you’re chicken.
I read STH reviews for “it is not released in a vacuum” phrase. What a peculiar feeling it triggers… Especially today, when it got paraphrased. And with a typo!
Patrick,
Why would you not have included the 7351P CPU into the “16 Core CPU Market” grid?
It is ~$400 cheaper than the 7351 CPUs, while providing (IIRC) the same performance. When talking about core-based licensing optimization it seems that this is the CPU model that is the “one to beat”.
{So much so that AMD really needs to make sure that they offer a 7371P variant, to help lock up that licensing market}
I’m with BinkyTO when is there going to be a 7371P version? I know what I want for Christmas. I’ll pay the power man.
It’s good to see AMD’s finally getting serious about frequency optimized. Maybe it’s better called performance per core optimized but that’s why we couldn’t use the EPYC that’s out there. Chips looked cheap but they’d cost too much for us to license servers for our environments. Your analysis is spot on.
6142 max all-core turbo is 3.3 GHz on 16 cores at around $2,950
Typo: Standard instead of Stanard
The base Windows Server 2019 license for Windows Server 2019 Stanard and Datacenter are 16 cores which
We were having this exact conversation in the office last week. We have about 40 Windows DC hosts. We’re RAM limited at 16C and 768GB. We want to go EPYC but the single core is too low. This will fix
When in Q1?
That p5 SPEC CPU2017 here’s official data since I didn’t get it on the first read through
2x Gold 6130 164 https://www.spec.org/cpu2017/results/res2017q4/cpu2017-20171114-00735.html
2x EPYC 7351 165 https://www.spec.org/cpu2017/results/res2018q4/cpu2017-20180918-08912.html
2x Gold 6142 178 https://www.spec.org/cpu2017/results/res2017q4/cpu2017-20171211-01573.html
If Gold 6130 to 6142 is up to 500MHz more, and EPYC 7351 to EPYC 7371 is 700MHz more, and more at the top single thread speed, then EPYC 7371 is going to do 185+. That wasn’t clear from the article on p5, but when I read the words and then looked at Cisco’s official results it made perfect sense. Maybe that’ll help someone else out
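The back-of-the-envelope estimate in the comment above can be reproduced with a short Python sketch. It assumes, as the commenter appears to, that the score scales linearly with base clock and that the points gained per MHz on the Intel pair carry over to EPYC. Both are rough assumptions, and the result is an extrapolation, not a measured score. The function names are hypothetical.

```python
def gain_per_mhz(score_low: float, score_high: float, delta_mhz: float) -> float:
    """Score points gained per MHz between two otherwise similar CPUs."""
    return (score_high - score_low) / delta_mhz


def extrapolate(base_score: float, per_mhz: float, delta_mhz: float) -> float:
    """Linear estimate of a score after a clock increase of delta_mhz."""
    return base_score + per_mhz * delta_mhz


# Scores quoted in the comment: 2x Gold 6130 = 164, 2x Gold 6142 = 178,
# with "up to 500MHz" between them; 2x EPYC 7351 = 165, and the EPYC 7371
# runs 700MHz higher.
intel_slope = gain_per_mhz(164, 178, 500)
estimate_7371 = extrapolate(165, intel_slope, 700)
print(round(estimate_7371, 1))
```

The linear estimate lands just under 185, which lines up with the commenter's "185+" once the higher top single-thread speed is taken into account.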
If they can do 4GHz at 225w crank it up. Intel’s been on this low power stuff for years but our data centers sell us metered power plus a cooling adder. 1 less server per rack would pay for 50-60w per server real quick.
If this comes in under $2k it’ll be a great value but you still have to get people to welcome AMD again
Hopefully sth to go against the 6144 next.
I love STH for this review and analyst piece combined in one. They’re like analysts who have hands-on experience not just theoretical exposure.
For us, this is an important launch but not because we’re going to buy it. Our IT org will only buy second gen products. Rome would’ve been considered first gen for our Windows cluster. I’m going to put these in a RFP draft when they’re on the Dell site so I can get them nixed for being first gen. That’ll give me documentation to show Rome is a second gen product later in 2019. Next-level thinking here.
I was hoping for some 7nm EPYC by end of Q1/2019.
AMD has a once in a lifetime chance against Intel with Milan, they better hurry up to enjoy it as long as possible.
I bought some EPYC 7371 today for around 1550 USD. This price is absolutely fantastic and Intel has nothing competitive.
What cooler did you guys use? The Tyan heatsink says it doesn’t support the 200 watt processor.