TACC Frontera Launched as Fastest Academic Supercomputer

TACC Frontera Supercomputer Dell EMC And Intel

Today TACC Frontera was officially unveiled. If this sounds a bit strange, there is a good reason. Frontera already placed at #5 on the Top500 list in June 2019. We also heard about Frontera’s liquid-cooled design in our piece Dell EMC Talks Deep Learning and AI Q3 2019. Still, when you spend a lot of money on a supercomputer, you want to get all of the PR possible.

A Look Behind TACC Frontera

Instead of regurgitating a press release, we wanted to show what is powering the system since that is more interesting.

TACC Frontera Supercomputer Design

Starting with storage, the primary tier is a disk-backed capacity system, alongside a faster NVMe scratch tier. The 4PB “fast scratch” NVMe-backed system is rated at 1.5TB/s. The disk tier provides 50PB rated at 300GB/s.
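To put those two ratings side by side, here is a rough sketch of how long it would take to write each tier end to end at its rated bandwidth. The capacities and bandwidths come from the figures above; the decimal unit conversion (1PB = 1000TB) and the function name `seconds_to_fill` are our own assumptions for illustration, and real workloads will not sustain peak rates.

```python
# Rated figures from the article. Decimal units (1 PB = 1000 TB) are an assumption.
TB_PER_PB = 1000
GB_PER_TB = 1000


def seconds_to_fill(capacity_pb: float, bandwidth_gb_s: float) -> float:
    """Time to write the full capacity once at the rated bandwidth."""
    capacity_gb = capacity_pb * TB_PER_PB * GB_PER_TB
    return capacity_gb / bandwidth_gb_s


# 4 PB NVMe scratch at 1.5 TB/s (1500 GB/s)
scratch_s = seconds_to_fill(4, 1500)
# 50 PB disk at 300 GB/s
disk_s = seconds_to_fill(50, 300)

print(f"NVMe scratch: {scratch_s / 60:.0f} minutes to fill")
print(f"Disk tier:    {disk_s / 3600:.1f} hours to fill")
```

Under these assumptions the scratch tier turns over in well under an hour, while the disk tier takes close to two days, which is the gap one would expect between a scratch tier and a capacity tier.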

The interconnect is Mellanox InfiniBand HDR-100 for both the primary compute nodes and the liquid-submerged GPU systems. The Longhorn system, built on IBM Power9 nodes and NVIDIA Tesla V100 GPUs, uses Mellanox EDR InfiniBand.

In terms of primary compute, things here are fascinating. There are 8008 dual-socket Intel Xeon Platinum 8280 nodes, each with 192GB of memory. That seems to indicate 6x 16GB DIMMs per socket. With liquid-cooled Dell nodes, one would have thought this is the perfect installation candidate for Intel’s Xeon Platinum 9200 series. Low memory capacities and high core counts with liquid cooling for HPC are exactly what the Platinum 9200 series is designed for.
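The DIMM inference is simple arithmetic, sketched below along with the system-wide totals it implies. The node count and per-node memory are from the figures above; the 28 cores per socket is the Platinum 8280's core count, the 16GB DIMM size is our inference rather than a confirmed spec, and the totals use decimal units.

```python
NODES = 8008
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 28    # Xeon Platinum 8280 core count
MEM_PER_NODE_GB = 192
DIMM_GB = 16             # inferred DIMM size, not a confirmed spec

# 192 GB / 16 GB = 12 DIMMs per node, split across two sockets
dimms_per_socket = MEM_PER_NODE_GB // DIMM_GB // SOCKETS_PER_NODE

total_cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
total_mem_tb = NODES * MEM_PER_NODE_GB / 1000   # decimal TB, an assumption
mem_per_core_gb = MEM_PER_NODE_GB / (SOCKETS_PER_NODE * CORES_PER_SOCKET)

print(f"DIMMs per socket: {dimms_per_socket}")
print(f"Total cores:      {total_cores:,}")
print(f"Total memory:     {total_mem_tb:,.0f} TB")
print(f"Memory per core:  {mem_per_core_gb:.2f} GB")
```

Six DIMMs per socket would populate each of the 8280's six memory channels with one DIMM, which is the usual way to maximize memory bandwidth on this platform.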

The use of the Intel Xeon Platinum 8280 is interesting for another reason. Among the new systems on the Top500 list, 20 cores per socket is the most common configuration by far, meaning the 28-core parts have 40% more cores per socket than we normally see. Here is the CPU cores per socket chart from our Top500 November 2018 New Systems Analysis:

Nov 2018 Top500 New Systems Cores Per Socket

Here is the chart from our Top500 June 2019 New Systems Analysis, where Frontera is one of the 28-core machines:

June 2019 Top500 New Systems By CPU Core Count

The TACC systems tend to be ones that Dell and Intel win. Our sense is that Intel is providing significant discounts on the Platinum 8280 to win this deal. At list price, the Platinum 8280s alone would cost over $160 million.
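For reference, the list-price figure can be reproduced with a quick calculation. The per-CPU price of $10,009 used here is our assumption, based on Intel's published recommended customer pricing for the Platinum 8280; the node and socket counts are the ones given earlier in the article.

```python
NODES = 8008
SOCKETS_PER_NODE = 2
LIST_PRICE_USD = 10_009   # assumed Platinum 8280 list price per CPU

cpu_count = NODES * SOCKETS_PER_NODE
total_list_usd = cpu_count * LIST_PRICE_USD

print(f"CPUs:            {cpu_count:,}")
print(f"Total list cost: ${total_list_usd / 1e6:.1f} million")
```

This only covers the CPUs in the primary compute partition. It says nothing about what TACC actually paid, which is not public in this piece.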

Beyond the compute nodes, there is a single-precision subsystem aimed at AI and other work that does not require double precision. This subsystem is headlined by 360 NVIDIA Quadro RTX 5000 GPUs. The cooling is perhaps its most unusual aspect, as these GPUs use liquid submersion cooling. That is likely a pilot for what is to come in future generations.

Final Words

Frontera is a cool system, but in the next few years, it will be absolutely dwarfed as we enter the exascale era. Still, systems doing research today are important. Frontera is also clearly a testbed for technologies and design principles that will shape future generations of supercomputers.

3 COMMENTS

  1. They must have already locked into Intel or their software is Intel specific because they would have gotten a greater compute/power density with EPYCs…?

  2. Considering early access in March 2019, this is probably a machine ordered in Q3 2018 at the latest, unless they really pushed it and ordered it in early Q4 with the building and power already in place.
    So the 9200 wasn’t released yet and EPYC was still gen 1.
