Intel Xeon Ice Lake Edition Marks the Start and End of an Era


3rd Generation Intel Xeon Scalable “Ice Lake” SKU List

The 3rd Generation Intel Xeon Scalable SKU stack is now an absolute mess. Intel actually has a great diagram, probably the best we have ever gotten, so we are going to use that for discussion.

3rd Generation Intel Xeon Scalable SKU Stack April 2021

First, the left side is Cooper Lake. These are the H and HL SKUs launched in 2020. The L is significant here because those SKUs still carry a premium for the "Large" memory footprint. On the Ice Lake side, AMD competition has meant that, as we have been discussing since last summer, Intel is no longer charging a premium for higher memory capacity. The last projections we saw had the memory market at something like 85-90% 32GB and 64GB DIMMs in 2021, but the change does mean one can use PMem without paying for a higher memory tier.

We also get PMem 200 support on all but three Xeon Silver SKUs. Pricing is also generally lower than what we saw at the initial Cascade Lake Xeon launch, and at many of the classic 28-core-and-under SKU levels it is well below even the refresh SKU pricing. For example, the 28-core SKUs top out at $3,072 with the Xeon Gold 6348, while the Gold 6258R costs around a third more.
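
As a rough sanity check on that "around a third" figure, and assuming the Gold 6258R's $3,950 launch list price (not stated above): $3,950 / $3,072 ≈ 1.29, so the previous-generation refresh part lists for roughly 29% more than the new 28-core Gold 6348.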

Overall, Intel did a good job of unifying the feature set across its stack, albeit with some caveats. While this generation has a standard two FMA units for AVX-512 across the stack (necessary now that even the desktop Intel Core i9-11900K supports AVX-512), we still get down-clocked maximum memory speeds on the Gold 5300 and Silver 4300 series. We still think the Gold series needs to be unified to get rid of the Gold 5300/ Gold 6300 split, but Intel is moving in this direction by reducing feature distinctions.
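
If you want to see which of these instruction-set features a given Linux host actually exposes, a minimal sketch along these lines works (this is our own illustration, not anything from Intel's materials; note that /proc/cpuinfo reports feature flags such as avx512f and avx512_vnni, but not the number of FMA units):

    # Minimal sketch (Linux): list the AVX-512 related CPU feature flags and
    # check for DL Boost (VNNI). /proc/cpuinfo does not expose the FMA unit count.
    with open("/proc/cpuinfo") as f:
        text = f.read()

    flags = set()
    for line in text.splitlines():
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())

    avx512_flags = sorted(flag for flag in flags if flag.startswith("avx512"))
    print("AVX-512 related flags:", avx512_flags or "none")
    print("DL Boost (avx512_vnni) present:", "avx512_vnni" in flags)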

3rd Generation Intel Xeon Scalable Processor Level Differentiation

The Platinum 8380 is $8,099, which is interesting for two reasons. First, it is around 20% lower cost than the Intel Xeon Platinum 8280, suggesting that Intel does not have confidence it has a competitive edge against the EPYC 7763. Second, as late as Q2 (maybe Q3?) of last year Intel was messaging to its partners that this generation would top out at 38-core SKUs. Our sense is that the 8380 is not designed to be a high-volume SKU, but Intel wanted it to show a smaller gap to AMD and Arm core counts.
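
For context on that first point, and assuming the Platinum 8280's $10,009 launch list price (not listed above): $8,099 / $10,009 ≈ 0.81, so the 8380 comes in roughly 19% below its 28-core predecessor, in line with the ~20% figure.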

Just to give some sense of how complex this has gotten, Intel uses the "U" in its Core (client) series for its lower-TDP parts. Here, Intel uses the "U" for Uni-Processor, or single-socket-only, CPUs. Only two of the three 1P-only parts use the "U", though, since the Platinum 8351N instead lands on the Networking/ NFV "N" SKU list.

Intel Ice Lake Single Socket Optimized

Intel Speed Select Technology Performance Profile 2.0, or SST-PP, was previously found on the company's "Y" SKUs, which return in this generation. We will have benchmarks of the Xeon Platinum 8352Y later in our performance section. At the same time, the "S" SKUs, which have 512GB SGX enclave capacity, also have SST-PP support without carrying the Y. The virtualization SKUs have two models, but only one, the Platinum 8352V, carries the "V"; the Platinum 8358P is also a virtualization platform processor despite its "P" suffix.

Intel Ice Lake Cloud Optimized

The cloud players tend to have their own custom SKUs, so we are not sure who these are for, especially given the paltry 8GB SGX enclave sizes. SGX is designed for confidential computing in areas such as cloud deployments, so this is especially strange positioning.

Intel's naming conventions have gotten out of hand and are trending toward unmanageable with the 3rd Generation Xeon Scalable. We have been hearing from customers and partners that this makes it very difficult to have SKU discussions.

There are a few notable omissions here. There is no Xeon Bronze series in this generation. The entry price is now $501, so Intel is effectively increasing the minimum price to buy into its platform by several hundred dollars. This completely makes sense and is where the industry is heading.

The other notable omission is in the frequency-optimized parts. Intel's maximum all-core turbo is 3.6GHz on the Xeon Gold 6354 and 6346. The maximum single-core turbo is 3.7GHz on the Gold 6334 and Platinum 8358Q (Q is for liquid-cooled) parts. For context, the Platinum 8356H (Cooper Lake) had a base clock of 3.9GHz and turboed up from there. Intel is clearly focusing on platform and IPC gains, but the traditional frequency-optimized SKUs for per-core licensing workloads (e.g. databases) are conspicuously absent.

Intel Xeon Platinum 8356H Close

Overall, there is a ton going on in this SKU stack. It is amazing how many SKUs Intel fields while spanning only a 32-core range between the Silver 4309Y and the Platinum 8380.

Next, it is time to get to the performance.

20 COMMENTS

  1. @Patrick
    What you never mention:
    The competitors to ICL for HPC AVX-512 and AI inference workloads are not CPUs, they are GPUs like the A100, Instinct MI100, or T4. That's the reason why next to no one is using AVX-512 or DL Boost.

    Dedicated accelerators offer much better performance and price/perf for these tasks.

    BTW: Still nothing new on the Optane roadmap. It's obvious that Optane is dead.

    Intel will say that they are "committed" to the technology, but in the end they are as committed as they have been to Itanium CPUs as a zombie platform.

  2. Lasertoe – the inference side can do well on the CPU. One does not incur the cost of going over a PCIe hop.

    On the HPC side, acceleration is big, but not every system is accelerated.

    To be fair, Intel is targeting chips that raise the threshold before a system would need a dedicated accelerator. It is a strange way to think about it, but the goal is not to take on the real dedicated accelerators; it is to make the threshold for adding one higher.

  3. “not every system is accelerated”

    Yes, but every system where everything needs to be rewritten and optimized to make real use of AVX-512 fares better with accelerators.

    ——————

    “the inference side can do well on the CPU”

    I acknowledge the threshold argument for desktops (even though smartphones are showing how well small on-die inference accelerators work and winML will probably bring that to x86) but who is running a server where you just have very small inference tasks and then go back to other things?

    Servers that do inference jobs are usually dedicated inference machines for speech recognition, image detection, translation, etc. Why would I run those tasks on the same server where I run a web server or a DB server? The threshold doesn't seem to be pushed high enough to make that a viable option. Real-world scenarios seem very rare.

    You have connections to so many companies. Have you heard of real intentions to use inference on a server CPU?

  4. Hmmm… the real issue with using AVX-512 is the downclocking and the latency of switching between modes when you're running different things on the same machine. It's why we abandoned it.

    I’m not really clear on the STH conclusion here tbh. Unless I need Optane PMem, why wouldn’t I buy the more mature platform that’s been proven in the market and has more lanes/cores/cache/speed?

    What am I missing?

  5. Ahh okay, the list prices on the Ice Lake SKUs are (comparatively) really low.

    Will be nice when they bring down the Milan prices. :)

  6. @Patrick (2) We’ll buy Ice Lake to keep live migration on VMware. But YOU can buy whatever you want. I think that’s exactly the distinction STH is trying to show

  7. So the default is a single Lewisburg Refresh PCH connected to 1 socket? Dual is optional? Is there anything significant remaining attached to the PCH to worry about non-uniform access, given anything high-bandwidth will be PCIe 4.0?

  8. Would be great if 1P 7763 was tested to show if EPYC can still provide the same or more performance for half the server and TCO cost :D

  9. Sapphire Rapids is supposed to be coming later this year, so Intel is going 28c->40c->64c within a few months after 4 years of stagnation.

    Does it make much sense for the industry to buy Ice Lake en masse with this roadmap?

  10. “… a major story is simply that the dual Platinum 8380 bar is above the EPYC 7713(P) by some margin. This is important since it nullifies AMD’s ability to claim its chips can consolidate two of Intel’s highest-end chips into a single socket.”

    I would be leery of buying into an Intel sound bite. It may distract me from focusing on MY interests.

  11. Y0s – mostly just SATA and the BMC, not a big deal really unless there is the QAT accelerated PCH.

    Steffen – We have data, but I want to get the chips into a second platform before we publish.

    Thomas – my guess is Sapphire really is shipping 2022 at this point. But that is a concern that people have.

    Peter – Intel actually never said this in the pre-briefs; that was just us extrapolating what their marketing message will be. AMD has been having a field day with that point against Cascade Lake.

  12. I don't recall any mention of HCI, which I gather is a major trend.

    A vital metric for HCI is inter-host link speeds, and AFAIK AMD has a big edge?

  13. Patrick, great work as always! Regarding the SKU stack: call me cynical but it looks like a case of “If you can’t dazzle them with brilliance then baffle them with …”.

  14. @Thomas

    Gelsinger said: “We have customers testing ‘Sapphire Rapids’ now, and we’ll look to reach production around the end of the year, ramping in the first half of 2022.”

    That doesn’t sound like the average joe can buy SPR in 2021, maybe not even in Q1 22.

  15. Is the 8380 actually a single die? That would be quite a feat of engineering getting 40 cores on a single NUMA node.

  16. What on earth is this sentence supposed to be saying?

    “Intel used STH to confirm it canceled which we covered in…”
