AMD EPYC 9004 Genoa New Naming Convention
Before we get to the SKUs, a word on naming: the new generation is the AMD EPYC 9004 series. The 9 denotes the new product series, and the last digit denotes the fourth generation. In the middle, we get a core count magnitude and a performance designation.
At the end, we get a P (single-socket only) or F (frequency optimized). There are no more letters in the middle of non-custom parts. This is a slight improvement, but it is too bad this is not the 9409624 (series 9, 4th gen, 096 cores, 2.4GHz).
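The convention above can be sketched as a tiny decoder. The series digit, generation digit, and the P/F suffix meanings come straight from AMD's scheme described here; treating the two middle digits as a generic "core tier" and "performance tier" is our own simplification, not an official lookup table.

```python
# Minimal sketch of the EPYC 9004 naming scheme described above.
# The middle-digit "tier" interpretation is an assumption for illustration.

def decode_epyc_sku(model: str) -> dict:
    """Decode an AMD EPYC 9004-series model string like '9654P' or '9174F'."""
    digits = model[:4]
    suffix = model[4:]  # '' = standard, 'P' = single-socket only, 'F' = frequency optimized
    return {
        "series": int(digits[0]),        # 9 = EPYC 9000 series
        "generation": int(digits[3]),    # 4 = 4th generation (Genoa)
        "core_tier": int(digits[1]),     # higher digit ~ higher core count (assumption)
        "perf_tier": int(digits[2]),     # relative positioning within the tier (assumption)
        "single_socket_only": suffix == "P",
        "frequency_optimized": suffix == "F",
    }

print(decode_epyc_sku("9654P"))
print(decode_epyc_sku("9174F"))
```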
This is also why companies do not ask me for product naming advice. I once suggested the Intel Xeon Gold 5100 series should be Xeon “Pyrite”.
Next, let us get to the SKUs.
AMD EPYC 9004 Genoa SKUs and Initial Price Lists
The AMD EPYC 9004 Genoa series is launched with eighteen SKUs, or really fourteen different SKUs with four single-socket-only “P” variants.
With these, AMD has four “F” SKUs for frequency-optimized parts at 16, 24, 32, and 48 cores. There are five lower core count SKUs with one P variant at 32 cores. Finally, there are density-optimized SKUs with five models plus three P variants. It is somewhat amusing that AMD calls these “Core Density” SKUs since it is also saying we are a few months away from the 128-core Bergamo parts. A 48-core part at that point will not seem like a dense part.
Here is the SKU list we received from AMD with TDP, cache, frequency, and pricing information.
Looking at that chart, and adding some bars to it, we can see how the frequencies have increased in this generation, but also the prices, especially around the “F” SKUs.
Here is the cost per core of the Genoa SKUs. The AMD EPYC 9174F appears to have an astronomical price tag. This is designed specifically for maximum performance per core with a 4.1GHz base clock and 256MB of L3 cache for 16 cores. If you are running Windows Server, this is the way to maximize the performance per core you can get for a 16-core license.
Here are the dual-socket parts in a spreadsheet with previous-generation Milan, Milan-X, and Rome parts. A few observations: we have gone from the Rome generation at 225W (the 7H12 was not in this spreadsheet) to 360W (with a maximum of 400W). L3 cache figures are similar or higher, excluding Milan-X. One can also see that the clock speeds have increased substantially. As an example, the higher-end 64-core AMD EPYC 7763 had a base clock of 2.45GHz. The new AMD EPYC 9554 starts at a 3.1GHz base clock. If you recall, a ~14% increase in performance on a performance-per-clock basis is good, but so is a >25% increase in base clocks, which are often sustained when the chips are under load. More clocks and more performance per clock mean that even at a given core count, Genoa will be faster.
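Back-of-the-envelope, the two figures above compound. A quick sketch using only the numbers stated in the text (the ~14% per-clock figure and the two base clocks):

```python
# Rough combined generational gain from the figures above:
# ~14% more performance per clock, plus a 2.45 GHz -> 3.1 GHz base clock
# move (EPYC 7763 vs. EPYC 9554). This is an estimate, not a benchmark.
ipc_gain = 1.14
clock_gain = 3.1 / 2.45

combined = ipc_gain * clock_gain
print(f"base clock gain: {clock_gain:.2f}x, combined estimate: {combined:.2f}x")
```

That ~1.44x estimate at the same core count is why the clock increases matter as much as the per-clock gains.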
A quick note, the spreadsheet did not get the $ sign on the pricing for the new parts. That will be fixed over the next few days.
AMD is generally moving its price per core up in this generation. We do not know AMD’s exact pricing methodology, but it seems like AMD is trying to capture the value of having more cores and more performance per core in this generation.
Sorting the giant list by core count, we can see the pricing methodology a bit better. AMD is actually using the same $123/core price it used with the AMD EPYC 7763 on both the 96-core EPYC 9654 and 84-core EPYC 9634 parts. Again, we do not know AMD’s pricing methodology, but $123/core does not feel like a coincidence. Keeping a $123 price per core, but then adding 50% more cores, means that the list price now spikes to $11,805. A cloud provider would laugh at that number, as they were not paying anywhere near $123/core for Milan parts, but AMD’s methodology makes sense, even if it pushes the list price to five figures. Another way to think about this, compared to the famous $10,008 Xeons of the past, is that with inflation at over 8% per year, this is a similar price to the Xeon Platinum 8180/8280 at launch. Another way to look at it is that holding $123/core means that, on an inflation-adjusted basis, AMD is actually decreasing the cost per core.
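The $123/core observation is easy to sanity check using only figures already in the text:

```python
# Sanity check of the $123/core observation, using only figures from the text.
PRICE_PER_CORE = 123

for cores in (64, 96):
    print(f"{cores} cores x ${PRICE_PER_CORE} = ${PRICE_PER_CORE * cores:,}")

# 96 x $123 = $11,808, within a few dollars of the EPYC 9654's $11,805
# list price, which is why $123/core does not feel like a coincidence.
```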
Turning to the single-socket parts, a much more manageable list, AMD is doing something different. There are no more 8-core parts in the entire line. STH readers have seen the AMD EPYC 7232P many times, as that was often the chip used for photos of older systems. It has probably endured close to a hundred installation cycles at this point because it was the least costly CPU we could use. In this generation, the lowest core count “P” part is 32 cores. Also, the price per core has jumped by a massive amount. The two higher-end 96 and 64-core parts are $111/core, not far from $123/core.
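Using the ~$111/core figure cited above, the implied pricing and the size of the single-socket discount work out as follows. These are back-of-the-envelope numbers derived from the per-core rates in the text, not AMD's official list prices:

```python
# Implied pricing for the higher-end "P" parts at ~$111/core, versus the
# $123/core seen on the 2P-capable flagships (illustrative figures only).
P_RATE, TWO_P_RATE = 111, 123

for cores in (64, 96):
    print(f"{cores}-core P part: ~${P_RATE * cores:,}")

discount = (TWO_P_RATE - P_RATE) / TWO_P_RATE
print(f"single-socket discount vs. $123/core: ~{discount:.0%}")
```

A roughly 10% single-socket discount is far thinner than in prior generations, which fits the observation that AMD is no longer discounting 1P as heavily.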
Here is the cost per core on the “P” side for this generation; one can see the generational increase in these single-socket prices clearly.
Overall, it seems like AMD is aiming to increase the value it is getting from its newer faster cores. AMD is seeing a substantial deployment of single-socket servers. With previous generations, AMD was proving the paradigm shift. With Genoa, it seems like the ability to have massive 1P systems means that AMD is not discounting 1P as heavily. Here is Microsoft’s 1P and 24 DIMM design. Single socket servers are now massive in size.
Next, let us take a look at the test platform, and then get to the performance.
Any chance of letting us know what the idle power consumption is?
$131 for the cheapest DDR5 DIMM (16GB) from Supermicro’s online store
That’s $3,144 just for memory in a basic two-socket server with all DIMMs populated.
Combined with the huge jump in pricing, I get the feeling that this generation is going to eat us alive if we’re not getting those sweet hyperscaler discounts.
I like that the inter CPU PCIe5 links can be user configured, retargeted at peripherals instead. Takes flexibility to a new level.
Hmm… Looks like Intel’s about to get forked again by the AMD monster. AMD’s been killing it ever since Zen 1. So cool to see the fierce competitive dynamic between these two companies. So Intel, YOU have a choice to make. Better choose wisely. I’m betting they already have their decisions made. :-)
2 hrs later I’ve finished. These look amazing. Great work explaining, STH
Do we know whether Siena will effectively eliminate the niche for Threadripper parts, or are they sufficiently distinct in some ways as to remain separate lines?
In a similar vein, has there been any talk (whether from AMD or system vendors) about doing Ryzen designs with ECC that’s actually a feature rather than just not-explicitly-disabled, to answer some of the smaller Xeons and server-flavored Atom derivatives?
This generation of EPYC looks properly mean, but not exactly ready to chase Xeon D or the Atom derivatives down to their respective size and price.
I look at the 360W TDP and think “TDPs are up so much.” Then I realize that divided over 96 cores that’s only 3.75W per core. And then my mind is blown when I think that servers of the mid 2000s had single core processors that used 130-150W for that single core.
Why is the “Siena” product stack even designed for 2P configurations?
It seems like the lower-end market would be better served by “Siena” being 1P only, and anything that would have been served by a 2P “Siena” system instead using a 1P “Genoa” system.
Dunno, AMD has the tech, why not support single and dual sockets? With single and dual socket Siena you should be able to beat the Intel 8-channel memory boards on price *AND* price/perf for uses that aren’t memory bandwidth intensive. For those looking for max performance and bandwidth/core, AMD will beat Intel with the 12-channel (actually 24-channel x 32-bit) EPYC. So basically Intel will be sandwiched by the cheaper 6-channel from below and the more expensive 12-channel from above.
With PCIe 5 support apparently being so expensive on the board level, wouldn’t it be possible to only support PCIe 4 (or even 3) on some boards to save costs?
All the other benchmarks are amazing, but I saw a molecular dynamics test on another website and, Houston, we have a problem! Why?
Olaf Nov 11 I think that’s why they’ll just keep selling Milan
@Chris S
Siena is a 1P-only platform.
Looks great for anyone that can use all that capacity, but for those of us with more modest infrastructure needs there seems to be a bit of a gap developing where you are paying a large proportion of the cost of a server platform to support all those PCIE 5 lanes and DDR5 chips that you simply don’t need.
Flip side to this is that Ryzen platforms don’t give enough PCIE capacity (and questions about the ECC support), and Intel W680 platforms seem almost impossible to actually get hold of.
Hopefully Milan systems will be around for a good while yet.
You are jumping around WAY too much.
How about stating how many levels there are in CPUs? But keep it at 5 or fewer “levels” of CPU and then compare them side by side without jumping around all over the place. It’s like you’ve had five cups of coffee too many.
You obviously know what you are talking about. But I want to focus on specific types of chips because I’m not interested in all of them. So if you broke it down into levels and I could skip to the level I’m interested in, with how AMD is vs Intel, then things would be a lot more interesting.
You could have sections where you say that they are the same no matter what or how they are different. But be consistent from section to section where you start off with the lowest level of CPUs and go up from there to the top.
There may have been a hint on pages 3-4 but I’m missing what those 2000 extra pins do, 50% more memory channels, CXL, PCIe lanes (already 160 on previous generation), and …
Does anyone know of any benchmarking for the 9174F?
On your EPYC 9004 series SKU comparison, the 24-core 9224 is listed with 64MB of L3.
As a chiplet has a maximum of 8 cores, one needs a minimum of 3 chiplets to get 24 cores.
So unless AMD disables part of the L3 cache of those chiplets, a minimum of 96MB of L3 should be shown.
I will venture the 9224 is a 4-chiplet SKU with 6 cores per chiplet, which should give a total of 128MB of L3.
EricT – I just looked up the spec, it says 64MB https://www.amd.com/en/products/cpu/amd-epyc-9224
Patrick, I know, but it must be a clerical error, or they have decided to reduce the L3 of each of the 4 chiplets to 16MB, which I very much doubt.
3 chiplets are not an option either as 64 is not divisible by 3 ;-)
Maybe you can ask AMD what the real spec is, because 64MB seems weird?
@EricT I got to use one of these machines (9224) and it is indeed 4 chiplets, with 64MB L3 cache total. Evidently a result of parts binning and with a small bonus of some power saving.
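The arithmetic in this thread can be summarized in a short sketch, assuming the standard Zen 4 CCD limits of 8 cores and 32MB of L3 per chiplet and the 4-CCD, 64MB configuration reported in the last comment:

```python
# The chiplet arithmetic from the thread above: Zen 4 CCDs have up to
# 8 cores and 32 MB L3 each; the EPYC 9224 reportedly uses 4 CCDs with
# 64 MB L3 total, implying binned-down L3 on each CCD.
CORES_PER_CCD_MAX = 8
L3_PER_CCD_FULL_MB = 32

ccds, cores, l3_total_mb = 4, 24, 64   # EPYC 9224 figures per the thread

print(cores / ccds, "cores enabled per CCD")
print(ccds * L3_PER_CCD_FULL_MB, "MB L3 if every CCD were fully enabled")
print(l3_total_mb / ccds, "MB L3 actually enabled per CCD")
```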