Today, Arm is unveiling its new Neoverse N3 and V3 cores along with its CSS offerings for both. As one would expect, the Neoverse N3 updates the N2 and the Neoverse V3 updates the V2. CSS, or compute subsystems, is Arm’s way of delivering more pre-packaged IP together to help companies get to chips or chiplets faster. One important note: new Arm cores being available does not mean one can buy them in a product on shelves tomorrow. When we get a new Intel or AMD core, it is announced as available when it is in a product on the shelf (and has been pre-shipping to select customers for some time). In Arm’s announcements, availability means its IP is now available for customers to use to create chips. The competition for the Neoverse V3 and N3 from Intel and AMD will therefore be future-generation products.
New Arm Neoverse Cores, CSS, and Updated Roadmap
In the announcement, Arm focused on both the Neoverse V3 and Neoverse N3 cores, and also on expanding its CSS solutions to the V3 line after CSS launched with the N2 line.
Something Arm did not discuss in detail during the pre-briefing is that there are now E3 cores. They just get a mention on this slide.
Arm also gave us the codenames for next-gen CSS V-series (Vega) and N-series (Ranger) platforms, and what will presumably be the Neoverse V4 “Adonis” and N4 “Dionysus” products.
The small E-series box, presumably for the Neoverse E4, is labeled Lycius.
Arm Neoverse N3 and CSS N3
Arm is putting a big effort into selling CSS solutions since it gets to sell more IP. The CSS N3 will scale from 8 cores up to 32 cores, and Arm can get the 32-core version down to a 40W TDP.
Arm says that the new solution is up to 20% more efficient than its N2 core on a performance per watt basis. Arm did not provide end notes for this 20% claim.
We did not get a deep-dive into the Neoverse N3 core as we did for previous cores such as the N1.
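To put the 40W and 20% figures in context, here is a rough back-of-the-envelope sketch. It only uses the numbers quoted above. The function names are our own, the 20% is treated as a best case since Arm says "up to," and the per-core figure is simply the TDP divided evenly across cores, so it includes the uncore share rather than being a measured core power.

```python
# Figures quoted in the article: a 32-core CSS N3 at a 40W TDP and an
# "up to 20%" performance-per-watt gain over N2. The helper names below
# are illustrative, not anything Arm provides.
CORES = 32
TDP_WATTS = 40.0
PERF_PER_WATT_GAIN = 0.20  # best case, since the claim is "up to"


def watts_per_core(tdp_watts: float, cores: int) -> float:
    """Average TDP budget per core (includes the uncore share)."""
    return tdp_watts / cores


def perf_at_same_power(gain: float) -> float:
    """Relative performance if power is held constant."""
    return 1.0 + gain


def power_at_same_perf(gain: float) -> float:
    """Relative power if performance is held constant."""
    return 1.0 / (1.0 + gain)


print(f"{watts_per_core(TDP_WATTS, CORES):.2f} W of TDP per core")
print(f"{perf_at_same_power(PERF_PER_WATT_GAIN):.2f}x performance at the same power")
print(f"{power_at_same_perf(PERF_PER_WATT_GAIN):.2f}x power at the same performance")
```

This works out to 1.25W of TDP budget per core, and a 20% perf-per-watt gain means either 1.20x the performance at the same power or roughly 0.83x the power at the same performance. Without end notes, we cannot say which point on that curve Arm measured.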
Arm Neoverse V3 and CSS V3
The Arm Neoverse CSS V3 is really interesting. First, the performance claim is a 50% increase in per-socket performance, but that does not account for power. That means moving from the smaller, more efficient N-series core to the larger V-series core, and doing so without a power limit. This claim, however, did not have an end note on how it was measured.
The Neoverse CSS V3 offers 64 cores per cluster and up to 128 cores per socket, and it supports modern features like PCIe Gen5, CXL 3.0, and even HBM3. We do not know, for example, if the HBM3 support was used to get the 50% claim above because Arm did not say how that figure was reached.
One of the big features is that Arm says that it can help provide an NVIDIA Grace Hopper style compute platform for its customers if they have their own AI accelerator. Arm’s goal is to make the CPU compute side easy with CSS V3, albeit with fewer cores than NVIDIA’s 72-core Grace hemisphere.
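A small sketch shows why a per-socket performance claim without power data is hard to interpret. The 1.5x figure is Arm's claimed uplift. The socket power ratios in the loop are purely hypothetical values we picked for illustration, since Arm did not disclose power.

```python
# Arm's claimed 50% per-socket performance increase.
PERF_RATIO = 1.5


def perf_per_watt_ratio(perf_ratio: float, power_ratio: float) -> float:
    """Change in performance per watt given relative performance and power."""
    return perf_ratio / power_ratio


# Hypothetical socket power ratios (new / old). These are NOT Arm figures.
for power_ratio in (1.0, 1.25, 1.5, 2.0):
    change = perf_per_watt_ratio(PERF_RATIO, power_ratio)
    print(f"power {power_ratio:.2f}x -> perf/W {change:.2f}x")
```

The break-even point is a 1.5x power increase. Below that, the 50% uplift is also an efficiency gain. Above it, the socket is faster but less efficient, which is why the missing end note matters.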
Still, the idea for cloud providers is that they can work on the AI accelerator IP and have an easy path to adding Arm cores. Frankly, AMD and Intel need to emulate this capability in the chiplet era.
Arm Neoverse V3 and N3 Performance
Arm says that with core upgrades and software optimizations it can achieve big gains in things like xgboost.
In simulation, it can achieve more performance across the board. Here we notice that the gains are often in the 9-16% range for Arm Neoverse V2 to Neoverse V3 and the 9-30% range for Neoverse N2 to Neoverse N3. The outlier is xgboost, the AI data analytics workload, where Arm put in specific optimization work.
Here is the generational comparison Arm is doing versus Intel and AMD.
Here are the end notes for that. Note that some of the results are normalized. We will let folks read these:
Here is Arm’s research into doing generative AI on its CPUs.
That 23% uplift falls in line with a normal generational uplift for a server CPU.
Arm Chiplet System Architecture
Arm Chiplet System Architecture, or CSA, is really the design for using Arm compute chiplets along with chiplets from other IP sources.
We have discussed chiplets for years on STH, and Arm also sees the benefit so it is making this easier for its customers.
At some point, we expect AMD and Intel will explore selling their x86 chiplets to be integrated into other packages, so it makes sense that Arm is pushing this. It is an area where Arm has an advantage, but it also feels like Arm should be producing 64-core Neoverse V3 chiplets as part of CSS and then giving them to customers to further cut down design cycles if so many customers are interested in CSS.
Final Words
Arm is getting there. It now says that 80% of graduated CNCF projects natively support Arm. CNCF projects tend to be the tip of the spear when looking at overall software compatibility, and they are the ones with a large install base. Still, many are not deploying Arm even today because they want 100%, not 80%, for the servers they deploy. NVIDIA is poised to make the big change in this market as it will start favoring its CPUs for AI workloads. That is the type of shift that will move the needle.
Overall though, thinking back to using the first real server CPU cores with the Cavium ThunderX in 2016, the Arm ecosystem has come a long way and continues to evolve, which is great for the industry.