Intel Lunar Lake for AI PCs at Hot Chips 2024

Intel Lunar Lake Hot Chips 2024_Page_03

With the release coming in a few days, Hot Chips 2024 included a talk on Intel Lunar Lake. This is the company’s next-generation AI PC part for mobile. Intel is doing a lot of integration and making some major changes compared to Meteor Lake, so this is going to be a significant generational move.

Please note that we are doing these live at Hot Chips 2024 this week, so please excuse typos.

Lunar Lake is going to be released soon, and the idea is to deliver better performance and performance per watt than Meteor Lake.

Intel Lunar Lake Hot Chips 2024_Page_02

Intel is using different process nodes for different tiles here, which is becoming more common. Something big in this generation is the inclusion of memory on-package. This is similar to designs from Apple, NVIDIA, and some high-end HPC processors where the memory is integrated, rather than being in LPCAMM, SODIMM, or DIMM form.

Intel Lunar Lake Hot Chips 2024_Page_03

Intel is only going up to 32GB. One of the challenges is that Intel has to buy the memory from another vendor, which lowers the chip’s margins. A 64GB part would have a lower margin than a 32GB part because a larger share of its cost would be lower-margin DRAM. Apple, of course, charges silly amounts for additional memory, so it manages to get great margins thanks to vertical integration in its systems. This is one of those interesting areas where we can see competition and industry financials impacting performance.
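
To make the margin argument concrete, here is a rough sketch in Python. Every dollar figure and the markup are hypothetical placeholders, not Intel pricing. The only point is that bundling more purchased DRAM at a thin markup dilutes the blended percentage margin of the package.

```python
def blended_margin(silicon_cost, silicon_price, dram_cost, dram_markup):
    """Gross margin (as a fraction of revenue) for a package that bundles
    in-house silicon with DRAM bought from another vendor."""
    dram_price = dram_cost * (1.0 + dram_markup)
    revenue = silicon_price + dram_price
    cost = silicon_cost + dram_cost
    return (revenue - cost) / revenue


# Hypothetical inputs, chosen only for illustration.
SILICON_COST = 100.0      # assumed cost to build the compute silicon
SILICON_PRICE = 250.0     # assumed price charged for that silicon
DRAM_COST_PER_GB = 3.0    # assumed purchase cost of DRAM per GB
DRAM_MARKUP = 0.10        # assumed thin markup on pass-through DRAM

for capacity_gb in (16, 32, 64):
    margin = blended_margin(
        SILICON_COST, SILICON_PRICE, capacity_gb * DRAM_COST_PER_GB, DRAM_MARKUP
    )
    print(f"{capacity_gb}GB package: blended margin {margin:.1%}")
```

Under these assumptions, the absolute profit per package still rises slightly with capacity, but the percentage margin falls, which is the effect described above.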

Intel Lunar Lake Hot Chips 2024_Page_04

Next, Intel is getting into the SoC structure, including the chiplets and compute tile. Here, we can see the four P-cores with 3MB of cache, IPU, GPU, memory subsystem, and more all on the same die.

Intel Lunar Lake Hot Chips 2024_Page_07

Lunar Lake has a memory side cache. This is an 8MB physical cache, meant to reduce DRAM traffic.
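
As a toy illustration of how a memory side cache cuts DRAM traffic, the sketch below simulates an 8MB cache in front of DRAM. The 64-byte line size, full associativity, and LRU replacement are all assumptions made for simplicity. They are not details from the talk and may not match Intel's design.

```python
from collections import OrderedDict

LINE_BYTES = 64          # assumed cache line size
MB = 1024 * 1024


def dram_line_fetches(addresses, cache_bytes=8 * MB):
    """Count accesses that miss a fully associative LRU cache and go to DRAM."""
    capacity = cache_bytes // LINE_BYTES
    cache = OrderedDict()
    misses = 0
    for addr in addresses:
        line = addr // LINE_BYTES
        if line in cache:
            cache.move_to_end(line)        # mark as most recently used
        else:
            misses += 1                    # this line must come from DRAM
            cache[line] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used line
    return misses


def sweep(buffer_bytes, passes):
    """Yield addresses for repeated linear passes over one buffer."""
    for _ in range(passes):
        yield from range(0, buffer_bytes, LINE_BYTES)


# A 4MB working set fits in the cache, so only the first pass reaches DRAM.
fits = dram_line_fetches(sweep(4 * MB, passes=3))
# A 16MB working set swept linearly defeats LRU, so every pass reaches DRAM.
spills = dram_line_fetches(sweep(16 * MB, passes=3))
print(f"4MB buffer:  {fits} DRAM line fetches")
print(f"16MB buffer: {spills} DRAM line fetches")
```

The benefit depends heavily on whether the hot data fits, which is why a cache like this helps most with smaller, frequently reused buffers.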

Intel Lunar Lake Hot Chips 2024_Page_08

The E-core cluster now has 4MB of L2 cache, its own power delivery, and so forth. E-cores are not the old Atom cores. They are quite fast these days. There are now four E-cores here instead of the two in Meteor Lake. The memory side cache is being used here as well for higher performance.

Intel Lunar Lake Hot Chips 2024_Page_09

A big part of Lunar Lake is the power delivery and management. There are now 4 PMICs for the SoC. The PMICs are allegedly the same but replicated and managed independently.

Intel Lunar Lake Hot Chips 2024_Page_10

These PMICs help optimize the SoC’s power delivery and efficiency. The E-core cluster is designed to handle most workloads today, as the E-cores have become faster. Intel also says that the sleep states are much faster to get in and out of.

Intel Lunar Lake Hot Chips 2024_Page_11

Intel still has things like Thread Director as well as workload classification to place workloads on the right core resources.

The Lion Cove P-core is a major change. This feels like one of the biggest changes for Intel in a long time, if not the biggest.

Intel Lunar Lake Hot Chips 2024_Page_14

Intel says the new design database is modernized to help transition to different processes and among different designs and power envelopes.

Here are the high-level highlights of Lion Cove. Intel says that they have gone through performance and efficiency optimizations. The biggest change might be that SMT or Hyper-Threading has been removed.

Intel Lunar Lake Hot Chips 2024_Page_15

Intel says that the new core is driving around a 14% performance gain on an IPC basis. That is important since clock speeds may be different. It also says that it can deliver double-digit performance per watt gains compared to the previous generation.
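
The reason the IPC framing matters can be shown with a little arithmetic. Single-thread performance scales roughly with IPC times clock speed, so a 14% IPC gain only translates directly into 14% more performance at equal clocks. The clock values below are hypothetical, picked only to show the effect.

```python
def relative_performance(ipc_gain, old_clock_ghz, new_clock_ghz):
    """Approximate speedup of a new core over an old one, modeling
    performance as IPC multiplied by clock frequency."""
    return (1.0 + ipc_gain) * (new_clock_ghz / old_clock_ghz)


IPC_GAIN = 0.14  # the roughly 14% figure Intel quoted

# Hypothetical clock pairs (old GHz, new GHz), not product specifications.
for old_clock, new_clock in ((5.0, 5.0), (5.0, 4.75), (5.0, 5.25)):
    speedup = relative_performance(IPC_GAIN, old_clock, new_clock)
    print(f"{old_clock} GHz -> {new_clock} GHz: {speedup:.3f}x")
```

A part that ships at lower clocks can give back some of the IPC gain, and one that clocks higher compounds it.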

Intel Lunar Lake Hot Chips 2024_Page_16

Intel is optimizing a lot on the lower end of the power scale. Intel says removing Hyper-Threading helped them get more efficient at low power.

Here we also have the Skymont E-core, designed to take over more workloads.

Intel Lunar Lake Hot Chips 2024_Page_18

Here are the highlights of Skymont:

Intel Lunar Lake Hot Chips 2024_Page_19

On the low-power island, the Lunar Lake to Meteor Lake delta is huge. Intel says this comes from the microarchitecture, but also from caches, system latencies, and more, so this is not just straight microarchitecture gains.

Intel Lunar Lake Hot Chips 2024_Page_20

Here is the performance versus power plot for the new low-power island E-core. Something to note is that the Lunar Lake curve extends further along the power axis than the Meteor Lake one.

Intel Lunar Lake Hot Chips 2024_Page_21

Here is a look at latency for the different parts of Lunar Lake and Meteor Lake cores.

Intel Lunar Lake Hot Chips 2024_Page_23

Here is a look at core-to-core latency. These figures are better than what we have seen in some recent generations of server chips.

Intel Lunar Lake Hot Chips 2024_Page_24

Here are the two curves for the E-cores and P-cores in Lunar Lake.

Intel Lunar Lake Hot Chips 2024_Page_25

This is an example using Microsoft Teams. On Meteor Lake, Teams needs to go to the P-cores.

Intel Lunar Lake Hot Chips 2024_Page_26

On Lunar Lake, Intel thinks it can keep everything on E-cores.
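
The Teams example boils down to an "E-cores first, spill to P-cores" placement idea. The sketch below is a toy greedy model of that idea, not Thread Director or any real scheduler. The task names, demand values, and cluster capacities are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    demand: float  # required throughput, in arbitrary made-up units


def place(tasks, e_cluster_capacity):
    """Greedy placement: fill the E-core cluster with the lightest tasks
    first, then spill whatever does not fit to the P-cores."""
    placement = {}
    used = 0.0
    for task in sorted(tasks, key=lambda t: t.demand):
        if used + task.demand <= e_cluster_capacity:
            placement[task.name] = "E"
            used += task.demand
        else:
            placement[task.name] = "P"
    return placement


# Hypothetical breakdown of a video call into tasks.
call = [
    Task("audio", 0.5),
    Task("ui", 0.5),
    Task("video_encode", 1.5),
    Task("background_effects", 2.0),
]

# Hypothetical capacities: a small low-power island vs. a larger, faster one.
print("smaller E cluster:", place(call, e_cluster_capacity=2.0))
print("larger E cluster: ", place(call, e_cluster_capacity=6.0))
```

With the smaller capacity, the heavier tasks land on P-cores, while the larger cluster keeps the whole call on E-cores, mirroring the contrast Intel drew between the two generations.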

Intel Lunar Lake Hot Chips 2024_Page_27

Intel says its new GPU architecture, Xe2, will go into client SoC iGPUs as well as dGPU designs.

Intel Lunar Lake Hot Chips 2024_Page_29

At the heart of the Xe2 cores is the vector engine. This has gone from two SIMD8 to a single SIMD16 structure.
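
One way to read the SIMD8 to SIMD16 change is that the total lane count stays the same (two times eight equals sixteen), but a single wider instruction now covers the work that previously took two. The sketch below just counts dispatches for a batch of work items. The batch size is arbitrary, and this ignores everything else about how the hardware actually issues work.

```python
import math


def dispatches(work_items, simd_width):
    """Number of SIMD instructions needed to cover a batch of work items."""
    return math.ceil(work_items / simd_width)


WORK_ITEMS = 1024  # arbitrary batch size for illustration

narrow = dispatches(WORK_ITEMS, simd_width=8)
wide = dispatches(WORK_ITEMS, simd_width=16)
print(f"SIMD8:  {narrow} dispatches")
print(f"SIMD16: {wide} dispatches")
```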

Intel Lunar Lake Hot Chips 2024_Page_30

Here is the summary of the new Xe2 GPU, which Intel says can deliver 1.5x the gaming performance at the same power.

Intel Lunar Lake Hot Chips 2024_Page_31

Here are Intel’s performance efficiency curves. A key change is that one architecture now scales from the low end to the high end. Meteor Lake U and Meteor Lake H had to use different engines, but the new Xe2 can cover the spectrum.

Intel Lunar Lake Hot Chips 2024_Page_32

Intel showed a Stable Diffusion demo with Lunar Lake versus Meteor Lake.

Intel Lunar Lake Hot Chips 2024_Page_33

On the media side, H.266 (VVC) decode has been added.

Intel Lunar Lake Hot Chips 2024_Page_35

Intel says that VVC decode is much lower with the new media engine.

Intel Lunar Lake Hot Chips 2024_Page_36

NPUs are hot topics. In this generation the NPU is getting larger and has higher clock speeds.

Intel Lunar Lake Hot Chips 2024_Page_38

Here are the key points of the Intel NPU 4. This has gone from 2 to 6 neural compute engines. Intel says this is 48 TOPS just on the NPU.
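
For a sense of where a figure like 48 TOPS can come from, peak throughput is usually estimated as engines times multiply-accumulate (MAC) units per engine times two operations per MAC times clock speed. The per-engine MAC count and the clock below are assumptions chosen to be consistent with the 48 TOPS claim. They are not numbers from the text above.

```python
def peak_tops(engines, macs_per_engine, clock_ghz):
    """Peak tera-operations per second, counting each MAC as two operations
    (one multiply and one add)."""
    ops_per_cycle = engines * macs_per_engine * 2
    return ops_per_cycle * clock_ghz * 1e9 / 1e12


ENGINES = 6             # neural compute engines, per the talk
MACS_PER_ENGINE = 2048  # assumed INT8 MACs per engine per cycle
CLOCK_GHZ = 1.95        # assumed NPU clock

print(f"Estimated peak: {peak_tops(ENGINES, MACS_PER_ENGINE, CLOCK_GHZ):.1f} TOPS")
```

The same formula shows why going from 2 to 6 engines matters so much: at a fixed clock and engine size, peak throughput scales linearly with the engine count.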

Intel Lunar Lake Hot Chips 2024_Page_39

Here is the performance of the new NPU. Note: the NPU 4 is designed to also use more power at its peak.

Intel Lunar Lake Hot Chips 2024_Page_40

Here is the connectivity slide with WiFi 7 and more.

Intel Lunar Lake Hot Chips 2024_Page_42

Here is the summary slide for the new platform.

Intel Lunar Lake Hot Chips 2024_Page_43

This is going to be a big change.

Final Words

It is super exciting to see just how far Intel is going with the Lunar Lake upgrade over Meteor Lake. This feels like an enormous shift in architecture. We should start seeing designs soon.

4 COMMENTS

  1. “The Lion Cove P-Core is a major change. This feels like one of, if not the, biggest change for Intel in a long time.”

    What? And the E core just gets this?

    “Here we also have the Skymont E-core, designed to take over more workloads.”

    The P core gets a mere 14% gain and the E core is a huge gain. I barely read about Lion Cove considering the low gains.

  2. What the heck is VVC? I see it mentioned, but absolutely no explanation of what that is appears to have been given. Or, maybe I’m going blind and it’s there but I can’t, or don’t, see it.

  3. VVC (Versatile Video Coding) is H.266, which is the latest and greatest video compression codec with gains that are higher than AV1, although at even greater complexity.

    Given the hostility of the H.265/H.266 Licensing Alliance, it’s doubtful, IMO, that we’ll ever see hardware encode in consumer devices. Decode is useful though as content publishers may adopt it, although the biggest ones seem to be keen on AV1 given the royalty free nature of it.

    AV1 may appear on consumer hardware encoders, but for home use H.265 is pretty good and doesn’t have crazy requirements, so you may as well use it.

    IIRC both Nvidia and Intel have AV1 encoders that are pretty good, but Intel’s implementation basically matches their H.265 encoder in terms of quality at the same bitrate, so there’s not much point in using it.

  4. Skymont is giving off “why don’t they make the entire plane out of black boxes” vibes. If they shipped a 32-core Skymont processor for workstations I would buy it.
