This week, Marvell launched a new line of PCIe Gen6 and CXL 3.x retimers that are already sampling. The new Marvell Alaska P series of PCIe retimers is built on a 5nm process, which lowers power consumption. These new chips are entering an increasingly competitive market.
Marvell Alaska P PCIe Gen6 Retimers Launched
Regarding Marvell Alaska, we have seen the line in products we have reviewed for over a decade. Some of those same parts are even in more modern systems we review. Now, the Alaska P line is aiming to fill a more involved role in future servers by addressing a huge challenge: shrinking PCIe signal reach.
For those who do not know, if you attend a Marvell event, like the Marvell AI Investor Day 2024, you will hear a lot about PAM4, SerDes, DSPs, and more. The company has been using the PAM4 transition as an inflection point across its portfolio. PAM4 is important because it moves the industry away from NRZ signaling, and that has big impacts in the PCIe Gen6 era.
Taking a look at why, Marvell has a great chart on the reach of PCIe signaling over standard PCB. With PCIe Gen3, we commonly saw PCB between the CPU and risers. When PCIe Gen4 servers started coming out, we noted in reviews how we were seeing more cabling between CPUs and risers, and that it would be more common in the future. With PCIe Gen5, we saw a halving of the distance signals would travel. If you look at modern servers, that is why we commonly see PCIe cables to risers.
If you look at a modern AI server, the PCIe connectivity is getting wild. Not only has the distance a PCIe signal can travel over standard PCB shrunk, but the number of PCIe devices is increasing, and those devices cost a lot.
Just based on the physical size of the components and cooling solutions, and the number of devices in systems, longer distances must be traversed. That is why the PCIe retimer market is heating up. (Astera Labs has ridden the PCIe retimer market to a ~$10B valuation.)
PCIe Gen5 to Gen6 is not as big of a drop-off in distance, because of changes like the switch from NRZ to PAM4. Still, with distances going up due to larger physical systems, and the reach getting shorter, PCIe retimers are becoming more important.
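To make the NRZ-to-PAM4 point concrete, here is a minimal sketch of the arithmetic. It assumes the standard per-lane transfer rates for each generation and that NRZ carries one bit per symbol while PAM4 carries two. The names `PCIE_GENERATIONS` and `nyquist_ghz` are ours for illustration and do not come from Marvell.

```python
# Per-lane transfer rate (GT/s) and line coding for each PCIe generation.
PCIE_GENERATIONS = {
    "Gen3": (8, "NRZ"),
    "Gen4": (16, "NRZ"),
    "Gen5": (32, "NRZ"),
    "Gen6": (64, "PAM4"),
}

# NRZ uses two signal levels (1 bit/symbol); PAM4 uses four (2 bits/symbol).
BITS_PER_SYMBOL = {"NRZ": 1, "PAM4": 2}


def nyquist_ghz(rate_gt_s, coding):
    """Return the Nyquist frequency in GHz: half the symbol rate."""
    symbol_rate_gbaud = rate_gt_s / BITS_PER_SYMBOL[coding]
    return symbol_rate_gbaud / 2


for gen, (rate, coding) in PCIE_GENERATIONS.items():
    print(f"{gen}: {rate} GT/s {coding} -> {nyquist_ghz(rate, coding):.0f} GHz Nyquist")
```

Under these assumptions, Gen5 and Gen6 share the same 16 GHz Nyquist frequency, so channel loss does not double the way it did from Gen4 to Gen5. PAM4 still packs four levels into the same voltage swing, which costs signal margin, so the reach shrinks anyway, just not by half.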
Now for the big one: instead of a 3.5-inch reach, Marvell sees a path to active optical cables (AOCs) that can span 30 meters using its components.
For standard lower-cost direct attach copper cables (DACs), it seems that 3m is reasonable, which is enough for in-rack communication. Active electrical cables (AECs) bring the PCIe retimer onto the cable module for a 7m reach, or enough for adjacent rack cabling.
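As a rough illustration of how those reach figures might translate into a cabling decision, here is a small sketch. The reach numbers are the approximate ones quoted above. The cost ordering (DAC, then AEC, then AOC) and the helper name `pick_cable` are our assumptions, not anything Marvell has published.

```python
# Approximate reach in meters, as quoted in this article.
# The list order (cheapest first) is an assumption for illustration.
CABLE_REACH_M = [
    ("DAC", 3.0),   # passive copper, in-rack
    ("AEC", 7.0),   # retimer in the cable module, adjacent racks
    ("AOC", 30.0),  # active optical
]


def pick_cable(distance_m):
    """Return the first cable type whose reach covers the distance, else None."""
    for name, reach_m in CABLE_REACH_M:
        if distance_m <= reach_m:
            return name
    return None


for distance in (2, 5, 20, 50):
    print(distance, "m ->", pick_cable(distance))
```

A 2m in-rack run lands on a DAC, a 5m run to the next rack needs an AEC, and anything beyond 30m falls outside the figures discussed here.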
Here is the block diagram for the Alaska P 16-lane retimer.
We also found a 2023 TE Connectivity demo apparently using the Marvell PCIe Gen6 retimers, so when the company says its retimers are sampling, that makes sense.
Marvell also says the retimer runs at around 10W typical, which is lower than both the Astera Labs Aries 6 and Broadcom Vantage 6 retimers.
Final Words
It is fairly easy to just look at today’s NVIDIA HGX H100/H200 servers that generally use Astera Labs Aries 5 PCIe retimers and think that AI servers are the only market. On the other hand, with PCIe Gen6, and what we expect to be a fast-following PCIe Gen7, the reach is going to continue to shorten. Further, we are just starting to see the first CXL memory devices today. In the future, having shelves with memory is going to happen, and those applications will need CXL switches and retimers. The same will continue to happen with AI accelerators. Further, NVMe needs to displace SAS in larger arrays. All of these applications mean that the CXL retimers will become a big component in future infrastructure. That is why the number of players in the CXL retimer market is increasing rapidly, with one more that we expect to enter the market next week.
Given we know one more PCIe Gen6 retimer will enter the market next week, we are going to publish a new Axautik Group Research Short for the PCIe retimer market at the same time. Our first Axautik Group LLC Research Short comparing the Astera Labs and Broadcom releases was successful, so we are going to build something that will cover more than just those two. Stay tuned for that.
This is starting to get crazy. I get that 10W is lower than some of the competition, but at 10W per 16 lanes this seriously adds up in power usage for a modern server.
With a CPU that has 128 lanes, reserving some for storage and networking, you are looking at a budget of what, 50W-70W just for retimers to get the signal where it needs to be.
Am I missing something here? Are we going to need to develop a new high speed lower power interconnect?
George – The impact of power consumption can be significantly more than 50-70W. Imagine if even every AI accelerator in an 8-way system needed a retimer. Add a few more for perhaps CPUs getting to CXL switches for shared memory shelves. This can easily be a 100W+ impact in a server. Of course, at 10kW+ per server there are folks who look to save every watt, but others that look at 10W as a rounding error.
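For anyone who wants to run the numbers in this thread, here is a back-of-the-envelope sketch. It assumes roughly 10W typical per 16-lane retimer, as Marvell quotes, one x16 retimer per accelerator, and three extra retimers for CXL links. That last count is purely a guess to stand in for "a few more".

```python
import math

RETIMER_WATTS = 10        # approximate typical power per retimer
LANES_PER_RETIMER = 16    # one retimer covers a x16 link


def retimer_power_w(lanes):
    """Estimate power in watts for retiming the given number of PCIe lanes."""
    return math.ceil(lanes / LANES_PER_RETIMER) * RETIMER_WATTS


# Retiming 80 to 112 of a CPU's 128 lanes matches the 50W-70W estimate.
print(retimer_power_w(80), retimer_power_w(112))  # 50 70

# 8-way AI server: one x16 retimer per accelerator, plus 3 assumed CXL retimers.
ai_server_w = retimer_power_w(8 * 16) + 3 * RETIMER_WATTS
print(ai_server_w)  # 110

# Share of an assumed 10kW server power budget.
print(f"{ai_server_w / 10_000:.1%}")
```

Under those assumptions, the retimers come to about 110W, or roughly 1% of a 10kW server. That is why some operators treat it as a rounding error while others chase it.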