XiangShan High-Performance RISC-V Processors at Hot Chips 2024

XiangShan is a RISC-V CPU project out of China that is now hosted on GitHub. It is a high-performance CPU design, rather than one of the lower-performance designs that we have seen from others.

Please note that we are doing these live at Hot Chips 2024 this week, so please excuse typos.

The project is a RISC-V effort run by folks at Chinese universities.

XiangShan Hot Chips 2024 Cover Large

The vision is that this is a RISC-V architecture with open tools and designs.

XiangShan Hot Chips 2024_Page_03

This is also designed as a high-performance rather than a low-performance core.

XiangShan Hot Chips 2024_Page_04

XiangShan is targeting the Arm Neoverse N2 with Kunminghu and the Arm Cortex-A76 with the Nanhu line, and both are being built.

XiangShan Hot Chips 2024_Page_05

In addition to cores, the goal is to allow for larger clusters for higher-performance chips. That also includes integrating both the Kunminghu and Nanhu cores. The project also includes the coherent interconnect and other parts shown on the left side of the diagram.

XiangShan Hot Chips 2024_Page_06

Here is a look at the Kunminghu microarchitecture. This includes vector and hypervisor extensions.

XiangShan Hot Chips 2024_Page_07

Here is a look at the branch predictors and the instruction cache/TLB on the frontend.

XiangShan Hot Chips 2024_Page_08

There is a 6-wide decode/rename/dispatch on the backend.

XiangShan Hot Chips 2024_Page_09

The integer block is a 4-ALU design, and there are floating point and vector blocks as well.

XiangShan Hot Chips 2024_Page_10

Here is a look at the memory block with the load-store pipeline, MMU, and data cache.

XiangShan Hot Chips 2024_Page_11

Cores have private L2 caches of up to 1MB, and there is a shared L3 cache. While a 16MB shared L3 may look small compared to modern large server CPU designs, it is fairly good if you think in terms of a Neoverse N2 device.

XiangShan Hot Chips 2024_Page_12

Here is the pipeline diagram across the 13-stage pipeline.

XiangShan Hot Chips 2024_Page_13

Here are some of the highlights that we will let you read through.

XiangShan Hot Chips 2024_Page_14

Here is a look at Kunminghu and Nanhu versus the Arm Neoverse N2 and Arm Cortex-A76.

XiangShan Hot Chips 2024_Page_15

In terms of performance, SPEC CPU2006 is a bit older, but it is fairly well-known.

XiangShan Hot Chips 2024_Page_16

Here is a real chip, board, and a real video running on the chip.

XiangShan Hot Chips 2024_Page_17

Beyond the first two architectures, there are additional open source tools going along with the initial designs.

XiangShan Hot Chips 2024_Page_18

Minjie is the agile development toolchain being used.

XiangShan Hot Chips 2024_Page_19

Here is a look at DiffTest, which is used to find RTL errors in a timely manner.
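
To give a rough feel for what differential testing of RTL means, here is a minimal sketch of the general idea: step the design under test and a trusted reference model in lockstep, and stop at the first instruction where their architectural state disagrees. Everything here (`ToyCore`, the `addi`-only instruction format, the injected bug) is a made-up stand-in for illustration and is not the project's actual tool or API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCore:
    """Hypothetical stand-in for either the RTL simulation or the reference model."""
    regs: list = field(default_factory=lambda: [0] * 4)
    pc: int = 0
    buggy: bool = False  # when True, inject a deliberate error for the demo

    def step(self, instr):
        op, rd, rs, imm = instr
        if op == "addi":
            value = self.regs[rs] + imm
            if self.buggy:
                # Injected bug: the result is truncated to 8 bits.
                value &= 0xFF
            self.regs[rd] = value
        self.pc += 4


def difftest(program, dut, ref):
    """Run both models in lockstep; return the first mismatch, or None."""
    for index, instr in enumerate(program):
        dut.step(instr)
        ref.step(instr)
        if (dut.pc, dut.regs) != (ref.pc, ref.regs):
            return {
                "index": index,
                "instr": instr,
                "dut": list(dut.regs),
                "ref": list(ref.regs),
            }
    return None


program = [("addi", 1, 0, 200), ("addi", 1, 1, 100), ("addi", 2, 1, 1)]
mismatch = difftest(program, ToyCore(buggy=True), ToyCore())
print(mismatch)
```

The point of checking after every instruction is that the error is reported at the instruction that caused it, rather than many cycles later when the corrupted value finally breaks something visible.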

XiangShan Hot Chips 2024_Page_21

There is also LightSSS, which reproduces debug information in simulation. These are not things that end users would normally run. Instead, these tools help build usable chips through testing before going to fabs.
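
The talk coverage here does not go into how LightSSS works internally, so the following is only a generic illustration of why snapshots help with debugging in simulation: run fast with tracing off, keep a recent snapshot, and when something fails, replay just the last stretch with tracing on. `ToySim`, its failure condition, and the `interval` parameter are all hypothetical.

```python
import copy


class ToySim:
    """Hypothetical simulator whose state is a cycle counter and an accumulator."""

    def __init__(self):
        self.cycle = 0
        self.acc = 0
        self.trace = None  # becomes a list when tracing is enabled

    def tick(self):
        self.cycle += 1
        self.acc += self.cycle
        if self.trace is not None:
            self.trace.append((self.cycle, self.acc))

    def failed(self):
        # Made-up failure condition standing in for an assertion or mismatch.
        return self.acc > 5000


def run_with_snapshots(sim, interval=25, max_cycles=1000):
    """Run without tracing; on failure, replay from the last snapshot with tracing."""
    snapshot = copy.deepcopy(sim)
    while sim.cycle < max_cycles:
        if sim.cycle % interval == 0:
            snapshot = copy.deepcopy(sim)  # keep only the most recent snapshot
        sim.tick()
        if sim.failed():
            replay = snapshot
            replay.trace = []  # enable detailed tracing only for the replay
            while not replay.failed():
                replay.tick()
            return replay.trace
    return None


trace = run_with_snapshots(ToySim())
print(len(trace), trace[0], trace[-1])
```

In this toy run the failure happens at cycle 100, and only the cycles after the last snapshot need to be re-simulated with tracing enabled, instead of the whole run.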

XiangShan Hot Chips 2024_Page_22

Here are some of the highlighted collaborations, including a server CPU, a 5nm AI acceleration chip, and a 7nm DPU.

XiangShan Hot Chips 2024_Page_23

Here is the summary and some of the roadmap.

XiangShan Hot Chips 2024_Page_24

With two teams running in parallel, the project hopes to keep creating new designs and tape out a new chip every year.

Final Words

It is cool to see what are essentially two RISC-V projects out of China directly targeting the performance and product segments of two Arm CPUs. We often see customized RISC-V designs, but these are more general-purpose chips.
