At Hot Chips 29 (2017) AMD presented new details about its newest server chip. Two details we thought would be interesting to STH readers: the Infinity Fabric update as well as the Multi-Chip Module (MCM) cost savings. If you want to read more in-depth, check out ourĀ AMD EPYC 7000 Series Architecture Overview for Non-CE or EE Majors.
AMD EPYC Infinity Fabric Update
One area that we noticed a significant delta in AMD EPYC Infinity Fabric during our testing was on-package versus socket-to-socket bandwidth. SeeĀ AMD EPYC Infinity Fabric Latency DDR4 2400 v 2666: A Snapshot. At Hot Chips, AMD revealed a key detail in terms of the die-to-package topology: there are four die-to-die Infinity Fabric links per die, but only three are used.
The key reason for this is to maintain a high-performance on-package link with minimal trace lengths. We were told in a private briefing that the above is an artists rendition but based on their actual die layout. One can see that only three Infinity Fabric links are connected on each die.
For those wondering about dual socket configurations, the G0-G3 I/O links are used for socket-to-socket communication.
Also interesting about that diagram is the DDR4 linkage You will notice that the channels are represented as MA, MD, MC, MB instead of MA, MB, MC, MD working from the outside in. MA and MB are connected to one die while MC and MD are connected to another.
AMD EPYC MCM v. Monolithic Cost Savings
Here is AMD’s comparison, using their modeled production costs, of going MCM versus a monolithic die at 32 cores. Key here is that adding transistors for interconnects adds costs as do redundant features. For example each die has a server controller hub but only one is fully used.
In the end, to enter the market, the MCM module makes sense. One of the major topics at the conference today was MCM with Intel’s morning paper called “Heterogeneous Modular Platform” itself advocating its EMIB interconnect for MCM modules. AMD claims that it costs 59% as much as it would to manufacture a MCM 32 core package versus a monolithic die. This includes the approximately 10% area overhead for MCM related components.
At the end of the day, monolithic dies do provide high performance, but we are moving towards a MCM world if you believe just about every relevant presentation at Hot Chips 29.
This gives a pretty good inside why the yields for Ryzen are so high, there is so much spare in the connections. Only the best are picked for EPYC, 2nd choice is for Threadripper and the badly broken ones are for Ryzen. All the way down to Ryzen 3 this should give yields above 99%.
@ Misha Engel
EPYC is using a new stepping that is yet to be spotted in Threadripper and Ryzen products
Epyc is an awesome platform. Unfortunately, it doesn’t appear available to purchase anywhere. Anyone know a realistic general availability?
Supermicro is shipping retail units like the one we are using to review.