Broadcom is one of the companies behind many custom AI accelerators for hyper-scalers. As such, it is a big deal that the company is showing off an AI compute ASIC with optical attach during Hot Chips 2024, as this is likely being done for a customer project. The company also showed off its co-packaged optics and silicon photonics, which we love to see.
Please note that we are doing these live at Hot Chips 2024 this week, so please excuse typos.
Broadcom AI Compute ASIC with Optical Attach Detailed at Hot Chips 2024
Co-packaged optics have been a topic for a long time.
One of the big challenges in interconnects and networking is just the electrical I/O reach through PCB. There are designs using cables to go externally from the switch chip to the optical cages, but the endgame still feels like it is optics. On the other hand, the NVIDIA NVLink NVL72 is a great example of copper rather than optical attach.
Silicon Photonics makes for optical modules with fewer components.
Broadcom is now using silicon photonics and co-packaged optics for not just switches, but also for scale-up compute.
For those wondering, that switch photo is the Broadcom Tomahawk 5 Bailly.
Here is the CPO schematic. An important difference between Intel’s old silicon photonics and what Broadcom is doing is that Broadcom is using a pluggable laser. Lasers fail, so having a pluggable laser helps with serviceability.
This presentation includes a version of the Tomahawk 5 slide we have shown before, with the components of CPO (co-packaged optics).
The Tomahawk 4 “Humboldt” was the first generation system.
That looks a lot like the OCP 2022 CPO demo for Broadcom.
This is the Tomahawk 4 implementation of Silicon Photonics.
The new version is Tomahawk 5 Bailly. This is a 51.2T Ethernet switch in the same class as the Marvell Teralynx 10 51.2T 64-port 800GbE Switch we recently showed.
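For a quick sanity check on the 51.2T figure, here is a minimal sketch of the port arithmetic. The 800G and 400G port speeds come from this article; the helper name `port_count` is just for illustration.

```python
# Sketch: how many ports of a given speed fill a 51.2 Tbps switch.
SWITCH_CAPACITY_GBPS = 51_200  # 51.2T, as quoted for Tomahawk 5 Bailly


def port_count(port_speed_gbps: int, capacity_gbps: int = SWITCH_CAPACITY_GBPS) -> int:
    """Return the number of ports of one speed that use the full capacity."""
    if capacity_gbps % port_speed_gbps != 0:
        raise ValueError("port speed does not divide the capacity evenly")
    return capacity_gbps // port_speed_gbps


for speed in (800, 400):
    print(f"{port_count(speed)} x {speed}G")
```

That gives the 64 ports of 800GbE mentioned here, or 128 ports at 400G.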
In the new version, Broadcom has improved the packaging. FOWLP (fan-out wafer-level packaging) is used in things like mobile chips, so it provides a more stable platform for the optical interconnect.
Here are the steps for creating the chips with engines.
Here is the cross-section of the optical engine. Broadcom says that this is a more scalable way to manufacture and integrate the optical engines.
Broadcom showed this idea of having 128 ports of 400G optics on the table versus the Bailly a few months ago. Those optical modules include light sources, so the comparison would have been more accurate if Broadcom had also included the light sources on the CPO side.
Broadcom says that an 800G module will take 13-15W of power. With CPO, and removing things like DSP complexity, this is down to under 4.8W.
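To put those per-module numbers in switch-level terms, here is a rough sketch. It assumes a 51.2T switch fully populated with 64 ports of 800G, and it treats 4.8W as the CPO figure per 800G even though Broadcom says "under" 4.8W, so the CPO side is an upper bound. The variable and function names are ours.

```python
# Sketch: optics power for a fully populated 51.2T switch (assumed 64 x 800G).
PORTS_800G = 64                      # assumption: 51.2T / 800G per port
PLUGGABLE_WATTS = (13.0, 15.0)       # per-800G-module range quoted by Broadcom
CPO_WATTS = 4.8                      # "under 4.8W", used here as an upper bound


def optics_power(ports: int, watts_per_port: float) -> float:
    """Total optics power in watts for a number of identical ports."""
    return ports * watts_per_port


pluggable_low, pluggable_high = (optics_power(PORTS_800G, w) for w in PLUGGABLE_WATTS)
cpo_total = optics_power(PORTS_800G, CPO_WATTS)

# Fractional reduction in optics power versus each end of the pluggable range.
reduction_vs_13w = 1 - CPO_WATTS / PLUGGABLE_WATTS[0]
reduction_vs_15w = 1 - CPO_WATTS / PLUGGABLE_WATTS[1]

print(f"Pluggable optics: {pluggable_low:.0f}W to {pluggable_high:.0f}W")
print(f"CPO optics:       under {cpo_total:.1f}W")
print(f"Reduction:        {reduction_vs_13w:.0%} to {reduction_vs_15w:.0%} or better")
```

Under those assumptions, the optics alone go from roughly 830W to 960W down to a bit over 300W. This only covers the optics, not the switch ASIC or the rest of the system.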
One of the significant challenges is not just getting the optics attached but also operating without errors.
Here, Broadcom is saying that it is further optimizing the solution. These are steps to show vendors that the quality is up to what one gets with traditional pluggable optics.
In 51.2T switches, the pluggable optical modules use a lot of power. Using co-packaged optics lowers the total power. As a result, one saves around 30% power.
The next step is using similar technology to combine compute ASICs with the co-packaged optics. Here we can see the CoWoS packaged with HBM, a compute ASIC, and the optical chiplet.
The GPU attach looks a bit more advanced than the chip above, with more HBM and more compute tiles. Still, the idea is that one can get 64 links off of the chip to connect directly to switches.
Moving the optics away from the xPU is also important because it moves optical engines further from the hot compute. This may not seem like a big deal, but it matters to ensure optics operate reliably.
Today it is a 64x 100G device, with two devices combining for 12.8T. Where this gets crazy is going from 12.8T of density to 102.4T, an 8x jump in bandwidth.
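The bandwidth math is simple enough to sketch out. The link count, link speed, and device count are from the presentation as reported here; the variable names are ours.

```python
# Sketch: optical bandwidth today versus the 102.4T target, in Gbps.
LINKS_PER_DEVICE = 64
GBPS_PER_LINK = 100
DEVICES = 2

per_device_gbps = LINKS_PER_DEVICE * GBPS_PER_LINK   # 64 x 100G per optical device
today_gbps = per_device_gbps * DEVICES               # two devices together
target_gbps = 102_400                                # 102.4T

scale_factor = target_gbps // today_gbps

print(f"Per device: {per_device_gbps / 1000}T")
print(f"Today:      {today_gbps / 1000}T")
print(f"Target:     {target_gbps / 1000}T ({scale_factor}x today)")
```

Each device works out to 6.4T, two of them to 12.8T, and 102.4T is eight times that.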
Broadcom is showing the use of bi-directional optics for high-radix networks. The transmit and receive are on different wavelengths.
That bi-directional optics approach lowers the cost of fiber which is why it is attractive.
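As a rough illustration of the fiber savings, here is a sketch that assumes each link is a single lane, that a conventional duplex link uses one fiber for transmit and one for receive, and that a bi-directional link carries both directions on one fiber using two wavelengths. The 64-link count comes from the compute ASIC example in this article; everything else is our simplification, not Broadcom's numbers.

```python
# Sketch: fiber count for duplex versus bi-directional (BiDi) links.
FIBERS_PER_DUPLEX_LINK = 2  # assumption: separate transmit and receive fibers
FIBERS_PER_BIDI_LINK = 1    # assumption: both directions share one fiber


def fibers_needed(links: int, fibers_per_link: int) -> int:
    """Total fibers for a number of identical links."""
    return links * fibers_per_link


links = 64  # links off of one chip, per the presentation
duplex_fibers = fibers_needed(links, FIBERS_PER_DUPLEX_LINK)
bidi_fibers = fibers_needed(links, FIBERS_PER_BIDI_LINK)

print(f"Duplex: {duplex_fibers} fibers, BiDi: {bidi_fibers} fibers")
```

Under those assumptions the fiber count is cut in half, which adds up quickly across a high-radix network.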
There is a lot here.
Final Words
Optical attach must happen at some point. Intel showed its Silicon Photonics Connector in 2022, but it is still using copper interconnects on its chips. There were plans to replace HBM stacks with optical connections via Lightbender in 2025, but Intel stopped Lightbender. Broadcom does a lot of optical networking just by virtue of its networking business and is shipping co-packaged optics switches now. Moving to optical I/O is going to be a big deal for the AI space as companies strive to build bigger packages.
This is easily one of the coolest presentations this year.