AMD EPYC 9005 Turin Memory Capabilities
AMD EPYC “Turin” is still a 12-channel DDR5 design. Standard memory speeds go up to DDR5-6000, but AMD said it will qualify up to DDR5-6400 for certain customer platforms. AMD is not supporting MCRDIMM/MRDIMM in this generation. Instead, it plans to adopt MRDIMM once it becomes a JEDEC standard.
Here are AMD’s figures on its memory controllers. It is getting slightly more than just the raw 25% memory speed increase that comes from moving from DDR5-4800 to DDR5-6000.
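To put the raw 25% figure in context, here is a minimal sketch of the theoretical peak memory bandwidth per socket. It assumes a 64-bit (8-byte) data path per DDR5 channel and DDR5-4800 as the prior-generation baseline. The function name and parameters are ours for illustration, not something from AMD, and real measured bandwidth will be lower than these peak numbers.

```python
def peak_bandwidth_gbs(channels: int, mt_per_s: int, bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (decimal) for one socket.

    Assumption: each DDR5 channel moves 8 bytes of data per transfer.
    """
    return channels * mt_per_s * bytes_per_transfer / 1000


# Assumed baseline: prior generation at 12 channels of DDR5-4800
prior_gen = peak_bandwidth_gbs(12, 4800)

# Turin at the standard DDR5-6000 and the qualified DDR5-6400 speeds
turin_6000 = peak_bandwidth_gbs(12, 6000)
turin_6400 = peak_bandwidth_gbs(12, 6400)

# Raw uplift from memory speed alone
raw_uplift = turin_6000 / prior_gen - 1

print(f"DDR5-4800: {prior_gen:.1f} GB/s")
print(f"DDR5-6000: {turin_6000:.1f} GB/s ({raw_uplift:.0%} raw uplift)")
print(f"DDR5-6400: {turin_6400:.1f} GB/s")
```

Anything AMD shows above that raw uplift has to come from the memory controllers and the rest of the platform rather than from the DIMM speed itself.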
AMD also worked on its I/O and added more performance for high-speed networking.
AMD now supports CXL 2.0. AMD is focusing on Type-3 memory expansion support in this generation because it sees that as the dominant model in the next cycle.
We have done a ton with CXL. You can think of CXL memory, or Type-3 devices, as putting memory on a controller and card that is then plugged into PCIe lanes. That is a big oversimplification, but it is the general idea in a nutshell.
Not only does AMD support CXL 2.0 devices, but it says that it is getting better performance now. Having memory over retasked PCIe-CXL lanes does come at a latency hit, but one can get more performance and more capacity. If you have seen pieces like our Marvell Structera X CXL Expansion module, the opportunity is enormous. Putting 12 DDR4 DIMMs onto a CXL card has super circular economy implications.
AMD also is continuing its secure virtualization and security journey with Trusted I/O.
AMD also has new RAS features that we will let you read through:
Next, let us talk about the bottlenecks we found when testing the new huge CPUs.
I’ve come to rely on your cross-generational SKU stacks — hope you get that updated with these new CPUs!
Smart Data Cache Injection (SDCI) which allows direct insertion of data from I/O devices into L3 cache could be a huge gain for low latency network IO workloads. It’s similar to Intel’s Data Direct I/O (DDIO).
There’s great chips here! AMD engineers doing great things.
9965 what a time to be alive
Fascinating to finally see something that hits the limits of x4 PCIe4 SSDs in practice.
768 threads in a server & benchmarks… for a fun test you could run a CPU rendered 3D FPS game. IIRC there is a cpu version of Crysis out there somewhere.
I have a few questions:
Why does the client need so many VMs to run a workload instead of using containers to drastically reduce overhead?
Second question: can you go buy a 9175f and test that one with gaming?
I’m surprised you perform benchmarks on such a 2P system with NPS=1 and “L3 as NUMA Domain” turned off.
Such a processor deserves NPS=4 + L3_LLC=On to let the Linux kernel do proper scheduling.