Despite the “death of the mainframe” being a headline for many years, IBM is still making new generations of mainframes. Large customers running mission-critical workloads still want this level of technology, and pay top dollar for it. The latest IBM z15 mainframe shows just how awesome the design of these systems gets. We get to cover this at Hot Chips 32 (2020). We are going to update this piece as new details emerge during the talk.
IBM z15 Mainframe Overview
A mainframe is designed differently from typical commodity x86 servers in many respects. One of them is simply the way the processor complexes are constructed and delivered. A great example of this is the system control chip, which is shared between processors and carries a massive L4 cache. Abstracting this a bit, it is somewhat analogous to how AMD EPYC 7002 “Rome” is constructed with a chip that has a shared last-level cache. Then again, IBM z15 is bigger.
One of the cooler features here is that the processor chip runs at 5.2GHz on a 14nm process. These are not like mainstream processors, so IBM has a lot more leeway in terms of higher TDP levels and cooling methodology. The IBM z15 is the company’s second 14nm mainframe part, following the z14. As one can see, IBM is following Intel in sticking with 14nm longer than with other nodes.
We will note that we covered IBM POWER10 earlier today.
IBM z15 Drawer Level Design Overview
Within the IBM z15, processors are not designed to operate alone. Instead, there are up to four processor chips per drawer. Each drawer also has a system control chip that holds the L4 last-level cache and handles the interconnect. One can get up to five drawers.
This is a view of what the chipset and drawer look like:
At the top, we have I/O above the system controller chip. Overall, this allows for up to 240 cores, 40TB of memory, and 60x PCIe x16 cards in the complete system.
IBM z15 Processor Design Overview
Each IBM z15 processor chip can have up to 12 cores. Each core has 4MB of L2 cache, and the chip has 256MB of shared L3 cache. IBM is increasing many of the microarchitectural features versus the previous 14nm z14 design.
As one may imagine, IBM has a deep pipeline that is designed to operate at very high frequencies. Compared to Intel, which is using much lower frequencies on its Ice Lake-SP next-gen Xeon architecture, IBM is using fewer cores that run at higher frequencies yet still do a lot of work per core.
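For readers who want to see where the 240-core figure comes from, here is a quick back-of-the-envelope sketch using only the numbers quoted in this piece. The per-drawer memory number assumes the 40TB is split evenly across five drawers, which is our assumption for illustration and not something stated on the slides.

```python
# Maximum-configuration figures quoted in the article.
DRAWERS = 5
CHIPS_PER_DRAWER = 4
CORES_PER_CHIP = 12
L3_PER_CHIP_MB = 256
TOTAL_MEMORY_TB = 40

chips = DRAWERS * CHIPS_PER_DRAWER          # processor chips in a full system
cores = chips * CORES_PER_CHIP              # cores in a full system
aggregate_l3_mb = chips * L3_PER_CHIP_MB    # sum of all per-chip L3 caches

# Assumption: memory is divided evenly between drawers.
memory_per_drawer_tb = TOTAL_MEMORY_TB / DRAWERS

print(f"processor chips: {chips}")
print(f"cores: {cores}")
print(f"aggregate L3: {aggregate_l3_mb} MB")
print(f"memory per drawer (even split assumed): {memory_per_drawer_tb} TB")
```

The core count lines up with the 240 cores quoted for the complete system: five drawers, four chips each, 12 cores per chip.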
Another feature of the design is the new accelerators. One is the integrated deflate accelerator:
This is a hybrid LZ encoder and compressor. Compressing data helps the system’s various memory and storage hierarchies be used more efficiently. With hardware acceleration, these features can be added with less overhead. That hardware accelerator was previously an FPGA. That means the clock speed has gone from a few hundred megahertz to 5GHz+.
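To give a feel for what DEFLATE does with the kind of repetitive data these systems handle, here is a minimal software sketch using Python’s standard zlib module. This is not how one would drive the z15 accelerator. It only shows the same compression format running in software, and the sample record is made up for illustration.

```python
import zlib

# Hypothetical, highly repetitive record similar in spirit to transaction logs.
record = b"ACCT=0001;TYPE=DEBIT;AMOUNT=0000100.00;CCY=USD\n"
payload = record * 1000

# zlib.compress emits a DEFLATE stream inside a small zlib wrapper.
compressed = zlib.compress(payload, level=6)
restored = zlib.decompress(compressed)

# DEFLATE is lossless, so the round trip must reproduce the input exactly.
assert restored == payload

print(f"original:   {len(payload)} bytes")
print(f"compressed: {len(compressed)} bytes")
```

The more repetitive the input, the better the LZ stage does at replacing repeats with back-references, which is why this type of acceleration is attractive for structured business data.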
Since many of these systems are used at banks and other financial institutions, cryptography is another key feature of the z15.
IBM z15 has modular arithmetic acceleration for elliptic-curve cryptography (ECC) built in. This increases performance, which means that this encryption can be used with minimal overhead.
Something we wanted to highlight on this slide is that the 11.4x-21.8x speedup is for the on-chip z15 accelerator versus the “z14 using a Crypto Express6 PCI accelerator.” That frees up expansion slots for other uses while also increasing performance.
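For those wondering what “modular arithmetic for ECC” looks like in practice, below is a toy sketch of elliptic-curve point addition and scalar multiplication. The curve (y^2 = x^3 + 2x + 2 over the integers mod 17) and base point are a small textbook example we picked for illustration. Real deployments use standardized curves over primes hundreds of bits long, and nothing here reflects how IBM implements the hardware.

```python
# Toy curve: y^2 = x^3 + A*x + B over the integers modulo the prime P.
# These parameters are illustrative only and offer no security.
P = 17
A = 2
B = 2
G = (5, 1)  # a point on the toy curve

def inv_mod(x, p=P):
    """Modular inverse; requires Python 3.8+ for pow(x, -1, p)."""
    return pow(x % p, -1, p)

def point_add(p1, p2):
    """Add two curve points. None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its negation is the point at infinity
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * inv_mod(2 * y1) % P  # tangent line
    else:
        slope = (y2 - y1) * inv_mod(x2 - x1) % P          # chord line
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)

def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

print(point_add(G, G))
print(scalar_mult(19, G))
```

Every step is a multiply, add, or inverse modulo a prime. With production-size primes, those operations dominate the cost of signing and key exchange, which is the work an on-chip accelerator takes off the general-purpose pipeline.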
Final Words
We are unlikely to review a z15 at STH. Still, seeing this awesome hardware is always interesting. Hopefully, our readers will enjoy the quick look at the IBM z15.
Where are these chips fabbed? The former IBM fabs that were sold to GloFo and then to ON Semi?
Wild guess, GloFo in NY?
Well if IBM won’t send you a review unit how can we know what we mere mortals are missing out on?
I wish you to get this beast for a review. That would be awesome but useless I guess
Practically everyone(*) can get a free Linux virtual machine running on real IBM Z (or IBM LinuxONE) machines from the LinuxONE Community Cloud for up to 120 days, currently (as I write this) with the latest z15 generation processors as I understand it. (Easy enough to check from your VM.) Details here:
https://developer.ibm.com/linuxone
Direct registration link here:
https://linuxone.cloud.marist.edu/#/register?flag=VM
These are VMs, meaning you won’t have the whole machine to yourself. But it’s the really interesting hardware underneath, and did I mention it’s free? Enjoy.
(*) Residents of North Korea are likely among the few that unfortunately cannot.
That’s a lot of effort in the cores to increase single thread performance by 14%. Multicore performance improves 25% mostly because of 20% more cores (12 up from previous 10).
New instructions and additional accelerators built in for specific tasks. Still has the big fixed-point decimal maths for banks and other financials use. The coherent caches help these tasks to scale across many cores: keep the books balanced and the lights on.
Any idea of the power consumption/TDP of this architecture?