AMD EPYC 9005 Turin Turns Transcendent Performance with 768 Threads Per Server


AMD EPYC 9005 Turin Memory Capabilities

AMD EPYC “Turin” is still a 12-channel DDR5 design. Memory speeds go up to DDR5-6000, and AMD said it will qualify DDR5-6400 for certain customer platforms. AMD is not doing MCRDIMM/MRDIMM in this generation. Instead, it is waiting to adopt MRDIMM once it becomes a JEDEC standard.

AMD EPYC 9005 Turin Memory

Here are AMD’s figures on its memory controllers. AMD is getting slightly more than the raw 25% memory clock speed increase alone would suggest.
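As a rough cross-check on that 25% figure, here is a minimal sketch of the theoretical peak bandwidth arithmetic. It assumes a 64-bit (8-byte) data path per DDR5 channel and a DDR5-4800 baseline for the prior generation, which is what a 25% clock increase to DDR5-6000 implies. The function name is ours, and real sustained bandwidth will land below these theoretical peaks.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth in GB/s (decimal) for one socket.

    Assumption: each DDR5 channel moves 8 bytes of data per transfer.
    """
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


# Assumed DDR5-4800 baseline versus the DDR5-6000 and DDR5-6400 speeds above
baseline = peak_bandwidth_gbs(12, 4800)
turin = peak_bandwidth_gbs(12, 6000)
turin_6400 = peak_bandwidth_gbs(12, 6400)

uplift = turin / baseline - 1
print(f"DDR5-4800 x12: {baseline:.1f} GB/s")
print(f"DDR5-6000 x12: {turin:.1f} GB/s ({uplift:.0%} higher)")
print(f"DDR5-6400 x12: {turin_6400:.1f} GB/s")
```

Under those assumptions, a 12-channel DDR5-6000 socket tops out at 576 GB/s of theoretical bandwidth, and anything AMD gains beyond the 25% clock uplift has to come from efficiency improvements rather than raw transfer rate.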

AMD EPYC 9005 Turin Memory 2

AMD also worked on its I/O and added more performance for high-speed networking.

AMD EPYC 9005 Turin IO

AMD now supports CXL 2.0. AMD is focusing on Type-3 memory expansion support in this generation because it sees that as the dominant model in the next cycle.

AMD EPYC 9005 Turin CXL 1

We have done a ton with CXL. You can think of CXL memory, or Type-3 devices, as memory placed behind a controller on a card that then plugs into PCIe lanes. That is a big oversimplification, but it is the general idea.
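On Linux, CXL Type-3 memory that has been onlined as system RAM typically shows up as a NUMA node with memory but no CPUs. Here is a minimal sketch for spotting such nodes from sysfs. The function name is ours, and a CPU-less node is only a hint, not proof, that the memory sits behind CXL.

```python
import os
import re


def find_cpuless_nodes(sysfs_root="/sys/devices/system/node"):
    """Return sorted NUMA node IDs that report memory but no CPUs."""
    cpuless = []
    if not os.path.isdir(sysfs_root):
        return cpuless
    for entry in os.listdir(sysfs_root):
        match = re.fullmatch(r"node(\d+)", entry)
        if not match:
            continue
        node_dir = os.path.join(sysfs_root, entry)
        try:
            with open(os.path.join(node_dir, "cpulist")) as f:
                cpulist = f.read().strip()
            with open(os.path.join(node_dir, "meminfo")) as f:
                meminfo = f.read()
        except OSError:
            continue  # Skip nodes we cannot read
        # Per-node meminfo lines look like "Node 1 MemTotal:  2000 kB"
        total = re.search(r"MemTotal:\s+(\d+)\s*kB", meminfo)
        has_memory = total is not None and int(total.group(1)) > 0
        if cpulist == "" and has_memory:
            cpuless.append(int(match.group(1)))
    return sorted(cpuless)


if __name__ == "__main__":
    print("CPU-less NUMA nodes with memory:", find_cpuless_nodes())
```

An empty list on a machine without CXL memory expansion is the expected result. On a system with expanders installed, the node IDs returned here are the ones to look at with the usual NUMA placement tools.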

AMD EPYC 9005 Turin CXL Type 3 Memory

Not only does AMD support CXL 2.0 devices, but it says that it is getting better performance now. Putting memory behind retasked PCIe-CXL lanes comes with a latency hit, but one can get more performance and more capacity. If you have seen pieces like our Marvell Structera X CXL Expansion module, the opportunity is enormous. Putting 12 DDR4 DIMMs onto a CXL card has super circular economy implications.

AMD EPYC 9005 Turin CXL Type 3 Memory Performance

AMD is also continuing its secure virtualization and security journey with Trusted I/O.

AMD EPYC 9005 Turin Trusted IO

AMD also has new RAS features that we will let you read through:

AMD EPYC 9005 Turin RAS

Next, let us talk about the bottlenecks we found when testing the new huge CPUs.

8 COMMENTS

  1. Smart Data Cache Injection (SDCI) which allows direct insertion of data from I/O devices into L3 cache could be a huge gain for low latency network IO workloads. It’s similar to Intel’s Data Direct I/O (DDIO).

  2. 768 threads in a server & benchmarks… for a fun test you could run a CPU rendered 3D FPS game. IIRC there is a cpu version of Crysis out there somewhere.

  3. I have a few questions:

    Why does the client need so many VMs to run a workload instead of using containers and drastically reducing overhead?

    Second question: can you go buy a 9175f and test that one with gaming?

  4. I’m surprised you perform benchmarks on such 2P system with NPS=1 and “L3 as Numa Domain” turned off.
    Such a processor deserved an NPS=4 + L3_LLC=On to let the Linux kernel do proper scheduling.
