Snapdragon X Elite Qualcomm Oryon CPU Design and Architecture Hot Chips 2024

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_07

At Hot Chips 2024, we learned more about the Qualcomm Oryon CPU in the Snapdragon X Elite. For those who have not seen it, the Qualcomm Snapdragon X Elite is the company’s foray into Arm-based PC SoCs. Let us get to it.

Please note that we are doing these live this week at Hot Chips 2024, so please excuse typos.

The Qualcomm Oryon is the company’s CPU powering the Snapdragon X Elite SoC. This is the Arm-based core from the Nuvia team. The clusters are identical, but they are operated differently to manage power.

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_03

The CPU core areas that Qualcomm is highlighting are the Instruction Fetch Unit (IFU), Vector Execution Unit (VXU), Rename and Retire Unit (REU), Integer Execution Unit (IXU), Memory Management Unit (MMU), and Load and Store Unit (LSU).

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_04

Here are the fetch and decode specs for Oryon. The 13-cycle branch mispredict latency is not industry best, but Qualcomm says it is “balanced” for the design.
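
To put the 13-cycle figure in context, here is a rough back-of-the-envelope sketch in Python of how mispredict latency feeds into cycles per instruction. Only the 13-cycle penalty comes from the slide. The base CPI, branch fraction, and misprediction rate are made-up illustrative values, not Qualcomm numbers, and the model ignores any overlap between the penalty and other stalls.

```python
MISPREDICT_PENALTY_CYCLES = 13  # from the Hot Chips slide


def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """First-order model: each mispredicted branch adds a fixed penalty."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Hypothetical workload values, chosen only for illustration.
cpi = effective_cpi(
    base_cpi=0.25,          # assumed CPI with perfect prediction
    branch_fraction=0.20,   # assumed share of instructions that are branches
    mispredict_rate=0.02,   # assumed share of branches that are mispredicted
    penalty_cycles=MISPREDICT_PENALTY_CYCLES,
)
print(f"Effective CPI: {cpi:.3f}")
```

In this simple model, a shorter penalty only helps in proportion to how often branches are mispredicted, which is one way to read Qualcomm's "balanced" comment.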

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_06

Here is the fetch pipeline for the chip. After decoding, the instructions move to a 600+ entry re-order buffer. The decoders can handle every instruction class in the architecture.

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_07

Here are the rename, dispatch, and execution specs. The register files are physical register files at around 400 entries. Integer is 6-wide, vector is 4-wide, and load-store is 4-wide as well. Each pipe on the vector execution side is 128 bits wide. It supports almost every data type.
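
As a quick illustration of what four 128-bit vector pipes mean in practice, the sketch below computes the peak number of elements processed per cycle for a few element widths. It assumes every pipe can issue a full-width operation each cycle, which the slide does not guarantee for every operation type, so treat the results as upper bounds.

```python
VECTOR_PIPES = 4        # from the slide: vector is 4-wide
PIPE_WIDTH_BITS = 128   # from the slide: each vector pipe is 128 bits


def elements_per_cycle(element_bits):
    """Peak elements per cycle if all pipes issue a full-width op (assumption)."""
    return VECTOR_PIPES * PIPE_WIDTH_BITS // element_bits


for bits in (8, 16, 32, 64):
    print(f"{bits}-bit elements: up to {elements_per_cycle(bits)} per cycle")
```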

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_08

Here is the integer execution pipeline in diagram form. ALUs and shifters are in all of the execution units. We can also see the portion that transfers to the vector units.

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_09

This is the vector execution pipeline. We can see the portion that transfers to the integer execution side as well.

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_10

Here are the load-store specs. Qualcomm is using a standard 16-bit cell here. There can be over 200 in-flight load-store operations. Prefetching is very important here, so there is a mix of proprietary and industry-standard prefetchers. These are applied to caches and translation structures.

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_11

Here is the memory system hierarchy. There is a relatively large L2 cache. Each reservation station has 64 entries. The cache tends to operate at around core frequency with low latency. The average is 15-20 clocks to the L2 cache.
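
For a sense of scale, the sketch below converts the 15-20 clock L2 latency into nanoseconds. The clock frequency is an assumption for illustration only (3.8GHz is used here). This article does not state the frequency, and the L2 is only said to run at "around" core frequency.

```python
ASSUMED_CLOCK_GHZ = 3.8  # assumption for illustration, not from the article


def cycles_to_ns(cycles, clock_ghz):
    """Convert a latency in clock cycles to nanoseconds at a given frequency."""
    return cycles / clock_ghz


for cycles in (15, 20):  # L2 latency range quoted in the talk
    ns = cycles_to_ns(cycles, ASSUMED_CLOCK_GHZ)
    print(f"{cycles} cycles at {ASSUMED_CLOCK_GHZ}GHz is about {ns:.2f}ns")
```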

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_12

Here are the memory management unit specs. Not on the slide, but there are somewhere between 10 and 20 in-flight operations per clock.

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_13

Here is the memory subsystem. Notable is that there is a relatively small system-level cache at only 6MB. The 6MB cache can be used by all of the engines in the SoC.

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_14

Here are some of the security features, since these SoCs are meant for notebooks (which can be stolen or misplaced fairly easily).

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_15

Qualcomm’s per-core performance is very good and is a far cry from something like an Arm Neoverse N series or AmpereOne core. Also interesting is that Geekbench 6 performs better on Linux, and even on Windows Subsystem for Linux, than it does on Windows natively. The same is seen in the SPEC CPU2017 results.

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_17

Here is the memory system latency chart. The large transition just over halfway to the right is the 12MB L2 cache latency transition.
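
Charts like this are commonly produced with a pointer-chasing test, where each load depends on the previous one and the working set grows until it spills out of each cache level. Qualcomm did not say how its chart was generated, so the Python sketch below only illustrates the general technique. The function names are hypothetical, and interpreter overhead dominates Python timings, so a real measurement would use C or assembly.

```python
import random
import time


def build_chain(n, seed=0):
    """Build a random single-cycle permutation (Sattolo's algorithm).

    chain[i] holds the next index to visit, and following it from any
    start touches all n slots before returning to the start.
    """
    rng = random.Random(seed)
    chain = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j < i keeps the permutation a single cycle
        chain[i], chain[j] = chain[j], chain[i]
    return chain


def chase(chain, steps):
    """Follow the chain; every access depends on the previous result."""
    idx = 0
    for _ in range(steps):
        idx = chain[idx]
    return idx


def time_per_access_ns(n, steps=100_000):
    """Average time per dependent access for a working set of n slots."""
    chain = build_chain(n)
    start = time.perf_counter()
    chase(chain, steps)
    return (time.perf_counter() - start) / steps * 1e9


for size in (1 << 10, 1 << 14, 1 << 18):
    print(f"{size} slots: {time_per_access_ns(size):.1f}ns per access")
```

The random single-cycle ordering matters because it defeats simple stride prefetchers and guarantees the whole working set is touched.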

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_18

Here is the memory bandwidth chart using a single thread. A single core can be fed at just under 100GB/s, which is significant given the 135GB/s platform bandwidth from the LPDDR5X memory.
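
As a sanity check on those two numbers, the sketch below shows how a platform figure of roughly 135GB/s can arise and what share of it a single core would be using. The 8448MT/s data rate and 128-bit bus width are assumptions not stated in this article, and 100GB/s is used as a round upper bound for "just under 100GB/s".

```python
def peak_bandwidth_gbs(mega_transfers_per_sec, bus_width_bits):
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer / 1e9


# Assumed memory configuration, not taken from the article.
platform_gbs = peak_bandwidth_gbs(mega_transfers_per_sec=8448, bus_width_bits=128)

SINGLE_CORE_GBS = 100  # round upper bound for "just under 100GB/s"
share = SINGLE_CORE_GBS / platform_gbs

print(f"Platform peak: {platform_gbs:.1f}GB/s")
print(f"Single-core share: below {share:.0%}")
```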

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_19

Qualcomm is looking to scale Oryon beyond just notebooks.

Qualcomm Snapdragon X Elite Hot Chips 2024_Page_20

Notably missing here is a server.

Final Words

We did not cover the Snapdragon X Elite launch because, after using things like the old Arm-based Surface, I think it is worth waiting a generation or two to jump in. We have been using Windows on Arm since the previous generation developer kit, which we decided not to review. Of course, I am typing this on an Apple MacBook Pro M1, so perhaps I do not always follow that advice. Still, as the ecosystem matures around the Qualcomm Snapdragon systems, we will start covering them more often.

It is somewhat fascinating that there are no mentions of Arm on the slides. Perhaps that is due to litigation between Arm and Qualcomm.

1 COMMENT

  1. Where does -the rest of the cache- go? Lots of registers and 6 MB here and there is fine, but 1994’s like …ok 16 MB more here and there?
