The AMD EPYC 9005 is here. Not content with the Intel Xeon 6900P reasserting Intel’s server leadership for a few weeks, the new AMD EPYC 9005 turned in what one might call a transcendent performance. In reality, AMD’s formula was simple: increase TDP by 25%+, increase core counts by 50%, move to new process technology, and update its server processors to the Zen 5/Zen 5c architecture. Combining all that, we have the newest AMD EPYC processors, and they are performing exceptionally well.
Of course, since this is a major industry launch, we have a video version:
We suggest watching this in its own browser tab or app for the best viewing experience. AMD sent us the hardware we are using for testing, including the server platform and three sets of SKUs. We must say this is sponsored by AMD. At the same time, we also needed to augment the configuration with parts that Solidigm and Broadcom originally sent us for other pieces, but that we decided to use here.
One of the most important findings we had while doing our performance testing was that using slower SSDs and NICs led to notable performance loss, especially at 192 cores.
Neither Solidigm nor Broadcom sent us these parts specifically for this piece, but we had the samples on hand and put them to use since they were here. We must also say this is sponsored by Solidigm and Broadcom. Now that that is out of the way, let us get to it.
AMD EPYC 9005 “Turin” SKUs
Let us start with the first and perhaps most impactful change. AMD EPYC “Turin” is the new Zen 5 and Zen 5c generation of server processors.
Since the SKU stack and pricing are big points here, let us just lead with what many of you want to see:
AMD has Zen 5c core counts ranging from 96 to 192 cores. Still, these are all “Turin” so there is no Genoa/Bergamo split. Originally, when powering on our test platform, I saw the AMD EPYC 9755 and thought it was a Zen 5c part, like the EPYC 9754 was a Zen 4c part.
That was wrong. AMD made a strong move to unify the stack, and it makes sense. Intel is showing P-cores and E-cores as different architectures and, importantly, with and without SMT. AMD is saying that it has the same instruction set with different core counts, clock speeds, and cache. Strong move, AMD.
Two of the SKUs are 500W parts, which require updated power delivery in platforms. We still get “P” series single-socket SKUs, but only up to 96 cores. That is a change from previous AMD generations since we do not get a full core count P-series part.
The chip that is perhaps the most fun is the AMD EPYC 9175F. It has only 16 cores, which suits Microsoft and other per-core licensing, but pairs them with a 5GHz turbo and 512MB of L3 cache (32MB per core). That chip looks like a monster if you are paying on a per-core basis.
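As a quick sanity check on that per-core cache figure, here is a minimal Python sketch. The function name is ours, purely for illustration; the inputs are the 512MB of L3 and 16 cores quoted above.

```python
def l3_per_core_mb(total_l3_mb: int, cores: int) -> float:
    """Return the L3 cache available per core, in MB."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return total_l3_mb / cores


# EPYC 9175F figures quoted in the text: 512MB of L3 shared by 16 cores.
epyc_9175f_mb = l3_per_core_mb(512, 16)
print(f"EPYC 9175F: {epyc_9175f_mb:.0f}MB of L3 per core")
```

The same helper can be pointed at any other SKU once you have its total L3 and core count from the SKU table.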
AMD’s list pricing seems very aggressive compared to what we are seeing from Intel in this generation. Intel list prices are designed to be discounted and it seems that AMD is taking a different approach.
We did not get a Genoa-X update in this generation. AMD told us that Genoa-X is still a strong platform, but we do not have roadmap updates beyond that, other than that it is a potential technology for future parts. While Turin is going to be a mass-market product, let us get real here. If you are building something like a 24-bay hard drive storage server or something else that does not need the full 12-channel SP5 socket, PCIe Gen5, and so forth, then you are probably better off getting AMD EPYC Siena, Milan, or the EPYC 4004 series.
We already covered the Zen 5 and Zen 5c improvements in a lot of detail earlier this year. Unlike on the client side, server chips are homogeneous with either Zen 5 or Zen 5c, not both.
If you are familiar with the 4th Gen AMD EPYC SoCs, then this is going to be the big one. Here are all of the features of the new SoCs. Major differences are highlighted such as the core configurations, 512b AVX-512, memory speeds, CXL support, and more. This is the slide many will want to see at STH.
Still, it is the same instruction set between the two, so unlike Intel with its different P-cores and E-cores, the right way to think about the new cores is that they are tuned for different frequency, power, cache, and density profiles. The overall platform, however, is the same, in contrast to Intel’s two-platform (Xeon 6700 and Xeon 6900 series) model. AMD instead packages higher-density Zen 5c cores and higher-cache, higher-clock Zen 5 cores into different CCDs and packages.
I asked AMD about the 32x SATA support since that seems like an inefficient use of lanes. AMD agreed that it is not the most useful feature today, but said it needs to keep the support for platform compatibility. My sense is that we will see SATA support dropped in some future generation because it is a legacy feature that can be supported well by add-in cards for those applications that need it.
AMD continues to have the ability to use its I/O lanes either for socket-to-socket or as more PCIe lanes for single socket servers. AMD’s single socket offering has more PCIe Gen5 I/O than the Intel Xeon 6900P, but the upcoming Intel R1S single-socket solution will have more. In 2P configurations, the Intel Xeon 6900P has 192 PCIe Gen5 lanes while AMD is at up to 160 lanes. Competition is good!
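To make that lane trade-off concrete, here is a rough Python sketch. It assumes each socket exposes 128 flexible high-speed lanes and that each socket-to-socket link consumes 16 lanes on each socket. Those two constants and the function name are our assumptions for illustration, not figures taken from AMD’s slide. Under those assumptions, a 2P server with three inter-socket links lands on the 160-lane figure above, while four links trade some of that I/O for more socket-to-socket bandwidth.

```python
LANES_PER_SOCKET = 128  # assumption: flexible high-speed lanes per socket
LANES_PER_LINK = 16     # assumption: lanes each socket gives up per inter-socket link


def usable_pcie_lanes(sockets: int, inter_socket_links: int = 0) -> int:
    """Estimate the PCIe lanes left for devices after inter-socket links."""
    if sockets not in (1, 2):
        raise ValueError("this sketch only models 1P and 2P servers")
    if sockets == 1:
        # A single-socket server needs no socket-to-socket links.
        return LANES_PER_SOCKET
    reserved = inter_socket_links * LANES_PER_LINK
    if inter_socket_links < 1 or reserved > LANES_PER_SOCKET:
        raise ValueError("invalid number of inter-socket links")
    return sockets * (LANES_PER_SOCKET - reserved)


print("1P:", usable_pcie_lanes(1))
print("2P, 4 links:", usable_pcie_lanes(2, 4))
print("2P, 3 links:", usable_pcie_lanes(2, 3))
```

Which link count a given server uses is a platform design choice, so check the motherboard specifications rather than assuming the maximum lane count.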
Next, let us get to memory capabilities.
I’ve come to rely on your cross-generational SKU stacks — hope you get that updated with these new CPUs!
Smart Data Cache Injection (SDCI) which allows direct insertion of data from I/O devices into L3 cache could be a huge gain for low latency network IO workloads. It’s similar to Intel’s Data Direct I/O (DDIO).
There’s great chips here! AMD engineers doing great things.
9965 what a time to be alive
Fascinating to finally see something that hits the limits of x4 PCIe4 SSDs in practice.
768 threads in a server & benchmarks… for a fun test you could run a CPU rendered 3D FPS game. IIRC there is a cpu version of Crysis out there somewhere.
I have a few questions:
Why does the client need so many VMs to run a workload instead of using containers and drastically reducing overhead?
Second question: can you go buy a 9175f and test that one with gaming?
I’m surprised you perform benchmarks on such 2P system with NPS=1 and “L3 as Numa Domain” turned off.
Such a processor deserved an NPS=4 + L3_LLC=On to let the Linux kernel do proper scheduling.