More Details on the AMD Ryzen AI 300 Series, Ryzen 9000 Series, and Zen 5

AMD Zen 5 Architecture Strix Point SoC

AMD heard the feedback that many wanted a bit more on the new AMD Ryzen AI 300 series, Ryzen 9000 CPUs, and Zen 5, so we have some more to share. That includes functional block diagrams, a few PCIe lane changes, and the areas where the Zen 5 cores improved. Let us get into it quickly.

AMD Zen 5 and Zen 5c Optimization

Something cool that we got from AMD this week was the Zen 4 to Zen 5 comparison. We will have some detailed slides below, but this is a great summary of which changes in the processor drove Zen 5 performance.

AMD Zen 5 Architecture Performance Gain Drivers

Here is a bit more on the Zen 5 speeds and feeds and why this processor is faster than Zen 4.

AMD Zen 5 Architecture Core Complex Speeds And Feeds

Here are the new Zen 5 ISA improvements and instructions:

AMD Zen 5 ISA Additions

The PREFETCH[I*] addition is there to help software prefetch into the ICache. Previously, AMD had better support for doing this into the DCache.

With the AMD Ryzen AI 300 series, we have heterogeneous Zen 5 and Zen 5c cores. Zen 5c gives up some L3 cache and top-end frequency in exchange for better power efficiency.

AMD Zen 5 And Zen 5c Optimizations In Heterogeneous SoC

That heterogeneous SoC is relevant in the context of the AMD Ryzen AI 300 series.

AMD Ryzen AI 300 Series Update

AMD sent this Strix Point SoC diagram. Here, we can see some details from our previous piece, including how the Zen 5 and Zen 5c complexes need to go through the on-chip fabric to reach the other cluster’s L3 cache.

AMD Zen 5 Architecture Strix Point SoC

One of the more interesting points AMD made was that the PCIe lanes are down from 20 to 16. Apparently, the attach rate for those extra lanes was not high. The common use case was adding another M.2 SSD, but that option was infrequently used.
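To put the lane reduction in perspective, here is a minimal sketch of the arithmetic. It assumes a typical M.2 NVMe SSD uses a PCIe x4 link, which is our assumption and not a figure from AMD's slides. The function names are hypothetical and exist only for illustration.

```python
# Assumption (not from AMD): a typical M.2 NVMe SSD uses a PCIe x4 link.
LANES_PER_M2_SSD = 4


def lanes_removed(previous_lanes: int, current_lanes: int) -> int:
    """Return how many PCIe lanes were dropped between two designs."""
    return previous_lanes - current_lanes


def m2_slots_lost(previous_lanes: int, current_lanes: int,
                  lanes_per_slot: int = LANES_PER_M2_SSD) -> int:
    """Return how many whole M.2 slots the dropped lanes could have fed."""
    return lanes_removed(previous_lanes, current_lanes) // lanes_per_slot


if __name__ == "__main__":
    # 20 lanes previously, 16 lanes now, per the figures quoted above.
    print("Lanes removed:", lanes_removed(20, 16))
    print("x4 M.2 slots those lanes could feed:", m2_slots_lost(20, 16))
```

Under that x4 assumption, the four lanes that went away line up with exactly one additional M.2 SSD, which matches the use case AMD described.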

In our previous Architecture Trifecta AMD Zen 5 RDNA 3.5 and XDNA 2 piece, we discussed the new integrated GPU. AMD said that while they leveraged learning from the mobile IP licensing business, the bulk of the performance comes from building a larger engine.

AMD RDNA 3.5 Architecture

Here is the slide on XDNA 2 again:

AMD XDNA 2 Architecture

Next, let us get to the AMD Ryzen 9000 series.

AMD Ryzen 9000 Series Update

The AMD Ryzen 9000 series SoC utilizes homogeneous Zen 5 core complexes. At the same time, not everything is new. It appears as though AMD recycled the previous generation I/O die.

AMD Ryzen 9000 Granite Ridge Zen 5 SoC Architecture

As such, we expect the AMD Ryzen 9000 series to mostly be helped by the Zen 5 improvements, so next we have the detailed slides.

AMD Zen 5 Microarchitecture Update

We already showed the key levers AMD used to increase Zen 5 performance, but AMD went into a bit more detail. One item on this slide is that Zen 5 was designed for 4nm and 3nm process nodes.

AMD Zen 5 Architecture Design Objectives

Here is the block diagram of the Zen 5 core and the key overview items.

AMD Zen 5 Architecture Microarchitecture Overview

We are going to let folks read through the slides rather than repeating them. Here is the optimized branch prediction and fetch.

AMD Zen 5 Architecture Branch Prediction And Fetch

AMD has new OpCache storage and dual decode pipes.

AMD Zen 5 Architecture Decode Advancements

The dispatch has grown to 8 wide and has a bigger execution window.

AMD Zen 5 Architecture Wider

AMD also spent a lot of time on increasing the bandwidth within the cores.

AMD Zen 5 Architecture Load Store

On the floating point side, perhaps the big one is that AMD has a new AVX512 implementation with a full 512b datapath.

AMD Zen 5 Architecture VP Vector

The company says it has figured out a way to feed the AVX512 engines without having to resort to dramatic clock speed reductions.
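As a rough illustration of what a full 512b datapath means, the sketch below computes how many elements fit in one 512-bit vector and how many passes a 512-bit operation would take on a given datapath width. The 256b case is included purely as a hypothetical narrower design for comparison, and the function names are ours, not AMD's.

```python
VECTOR_BITS = 512  # width of an AVX512 register


def elements_per_vector(element_bits: int, vector_bits: int = VECTOR_BITS) -> int:
    """Return how many elements of a given width fit in one vector."""
    if vector_bits % element_bits != 0:
        raise ValueError("element width must divide the vector width")
    return vector_bits // element_bits


def passes_needed(datapath_bits: int, vector_bits: int = VECTOR_BITS) -> int:
    """Return how many passes a vector op needs on a given datapath (ceiling)."""
    return -(-vector_bits // datapath_bits)


if __name__ == "__main__":
    print("FP32 elements per 512b vector:", elements_per_vector(32))
    print("FP64 elements per 512b vector:", elements_per_vector(64))
    # Hypothetical comparison: a full-width datapath vs. a half-width one.
    print("Passes on a 512b datapath:", passes_needed(512))
    print("Passes on a 256b datapath:", passes_needed(256))
```

This only counts lanes and passes. It says nothing about clocks, power, or sustained throughput, which is where AMD's claim about avoiding dramatic clock speed reductions comes in.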

Final Words

Part of the reason we are showing this is because the AMD Ryzen 9000 series (and perhaps, one day, an update to the EPYC 4000 series?) and the Ryzen AI 300 series use the new cores. The other part is looking ahead to Turin and Turin Dense in Q4.

AMD Zen 5 And Zen 5c Uses

It will be interesting to see how AMD’s and Intel’s approaches to lower-power cores play out. For example, AMD calls SMT one of its most effective performance-per-watt tools. At the very least, this year is shaping up to be a lot of fun on the CPU side.

2 COMMENTS

  1. I would like to see all OEMs utilize the integrated USB4 controllers, because they have no excuse not to, other than artificial market segmentation.

  2. Typo in the PCI lanes section. Quoting:

    “One of the more interesting points AMD made was that the PCIe lanes are down from 20 to 16. Apparently the attach rate of utilizing those extra lanes.”

    I am not sure what is missing.

    Otherwise good article, thanks!
