The Esperanto ET-SoC-1 is kicking off Hot Chips 33 Day 2 with a different solution. The company effectively is taking a many-core RISC-V architecture designed for AI acceleration, and then building small chips that can work in parallel on single cards for AI inference. As with our other Hot Chips coverage, this is being done live.
Esperanto ET-SoC-1 1092 RISC-V AI Accelerator Solution at Hot Chips 33
Here is a quick summary of the Esperanto ET-SoC-1. The slide is a wall of text, so we are not going to just transcribe it.
Esperanto’s approach is to use many smaller chips each with its own DRAM instead of a larger chip.
Since Esperanto has a lower power budget, the goal is to be able to fit many of these smaller chips into a single slot with a 120W power budget.
Esperanto can get better performance per Watt by running at lower clocks and power, but to get solid benchmark results, each chip runs at 20W.
The main core, of which there are 1088, is the ET-Minion. These are basically small in-order cores, each with big vector/ tensor units and its own local L1 cache. Esperanto is using RISC-V here, so it has a customized core with a general-purpose instruction set plus Esperanto’s vector/ tensor instructions. Esperanto says RISC-V also takes the fewest gates to implement, so it makes for a smaller, lower-power core.
Eight of these cores form a neighborhood and have 32KB of shared cache.
32 of these cores share 4x 1MB banks of SRAM and have a mesh stop to the rest of the chip. In the “not inspired by the Hobbit” category, each of these blocks is called a Minion Shire.
The mesh network allows the Shires to talk to each other.
Here is what the block diagram looks like for the entire chip with its 34 Minion Shires and other Shires like the PCIe, Memory, and I/O Shires.
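To make the hierarchy easier to follow, here is a minimal sketch that tallies the figures quoted above. The constant names are ours for illustration, not Esperanto's, and the SRAM total only counts the 4x 1MB banks per Minion Shire, not L1 or any other on-die memory.

```python
# Figures are taken from the text above; the names are illustrative only.
CORES_PER_NEIGHBORHOOD = 8   # "Eight of these cores form a neighborhood"
CORES_PER_SHIRE = 32         # "32 of these cores share 4x 1MB banks"
MINION_SHIRES = 34           # "34 Minion Shires"
SRAM_BANKS_PER_SHIRE = 4
SRAM_BANK_SIZE_MB = 1

# How many eight-core neighborhoods make up one Minion Shire.
neighborhoods_per_shire = CORES_PER_SHIRE // CORES_PER_NEIGHBORHOOD

# Total ET-Minion cores across all Minion Shires.
total_minion_cores = CORES_PER_SHIRE * MINION_SHIRES

# Shire-level SRAM only (assumption: other caches are not counted here).
total_shire_sram_mb = MINION_SHIRES * SRAM_BANKS_PER_SHIRE * SRAM_BANK_SIZE_MB

print(neighborhoods_per_shire, total_minion_cores, total_shire_sram_mb)
```

The core total lines up with the 1088 ET-Minion figure quoted earlier.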
The card has a PCIe Gen4 link and uses LPDDR4x memory.
The ET-SoC-1 is then put onto cards with multiple chips per card. Specifically, Esperanto is focusing on the six-chip configuration. More on that soon.
Six of these chips can fit on a Glacier Point V2 card. That means one can fit 6,558 RISC-V cores and 192GB of DRAM on a card while running at around 120W.
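As a quick sanity check on the card-level numbers, here is a small sketch using only the figures quoted in this article. It assumes the DRAM is split evenly across the six chips, which the talk summary here does not explicitly state.

```python
# Figures are taken from the text above; the names are illustrative only.
CHIPS_PER_CARD = 6
WATTS_PER_CHIP = 20    # per-chip operating point quoted for the benchmarks
CARD_DRAM_GB = 192

# Power for the chips alone; other card components are not modeled.
card_chip_power_w = CHIPS_PER_CARD * WATTS_PER_CHIP

# Assumption: DRAM is divided evenly between the chips.
dram_per_chip_gb = CARD_DRAM_GB // CHIPS_PER_CARD

print(card_chip_power_w, dram_per_chip_gb)
```

Six chips at 20W each is where the roughly 120W card figure comes from.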
We have covered Twin Lakes and Glacier Point V2/ Yosemite V2 before. Facebook has been on Yosemite V3 for some time.
Esperanto is focusing on hardware in this talk, but it also has a software stack. Very little emphasis was put on this, which is strange since CUDA is a big competitive advantage for NVIDIA. Perhaps that is just because this is a hardware conference.
Here is a recommendation benchmark where Esperanto is showing better performance.
Here is a ResNet-50 benchmark where even the Habana Goya is included. Goya also looks very good here. Just as a quick note, when we did the NVIDIA Tesla T4 AI inferencing GPU benchmarks and review a long time ago, we did not see 70W. Esperanto may be using TDP rather than measured results.
While the ET-Minion is an in-order core designed to effectively exploit the vector/ tensor units, there are also out-of-order cores. These are the chip’s higher-end cores for when that kind of performance is required.
Here is the summary of stats of the TSMC 7nm chip.
Just a quick note, Esperanto is only just getting silicon, so the performance results are estimates.
Final Words
Overall, this is a very interesting solution. Esperanto has a modular architecture to scale. It is also interesting to see a RISC-V chip. RISC-V is generating a lot of buzz. It has taken time for that buzz to turn into product. As Arm has transitioned from the exciting new architecture to the more mature legacy architecture, RISC-V is turning into the next hot new thing.