Hungry for AI? New supercomputer contains 16 dinner-plate-size chips


The Cerebras Andromeda, a 13.5 million core AI supercomputer.

On Monday, Cerebras Systems unveiled its 13.5 million core Andromeda AI supercomputer for deep learning, reports Reuters. According to Cerebras, Andromeda delivers more than 1 exaflop (1 quintillion operations per second) of AI compute power at 16-bit half precision.

Andromeda is itself a cluster of 16 Cerebras CS-2 computers linked together. Each CS-2 contains one Wafer Scale Engine chip (commonly known as "WSE-2"), which is currently the largest silicon chip ever made, at about 8.5 inches square and packed with 2.6 trillion transistors organized into 850,000 cores.

Cerebras built Andromeda at a data center in Santa Clara, California, for $35 million. It is tuned for applications like large language models and has already been in use for academic and commercial work. "Andromeda delivers near-perfect scaling via simple data parallelism across GPT-class large language models, including GPT-3, GPT-J, and GPT-NeoX," writes Cerebras in a press release.

The Cerebras WSE-2 chip is roughly 8.5 inches square and packs 2.6 trillion transistors.


The phrase "near-perfect scaling" means that as Cerebras adds more CS-2 computer units to Andromeda, training time on neural networks is reduced in "near perfect proportion," according to Cerebras. Typically, when scaling up a deep-learning model by adding more compute power to GPU-based systems, one might see diminishing returns as hardware costs rise. Further, Cerebras claims that its supercomputer can perform tasks that GPU-based systems cannot:
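To make the scaling claim concrete: under ideal data parallelism, each of N machines trains on 1/N of every batch, so wall-clock training time shrinks in proportion to the number of machines added. The sketch below illustrates that arithmetic with made-up numbers; they are hypothetical for illustration, not Cerebras benchmark figures.

```python
# Illustration of "near-perfect scaling" under data parallelism.
# All numbers are hypothetical; they are not Cerebras measurements.

def ideal_training_time(baseline_hours: float, num_units: int) -> float:
    """Training time if speedup were perfectly linear in the unit count."""
    return baseline_hours / num_units

def scaling_efficiency(measured_hours: float, baseline_hours: float,
                       num_units: int) -> float:
    """Fraction of the ideal linear speedup actually achieved (1.0 = perfect)."""
    return (baseline_hours / measured_hours) / num_units

# Suppose one CS-2 takes 160 hours on some job; 16 units would ideally take 10.
print(ideal_training_time(160, 16))                      # 10.0
# If the 16-unit cluster instead measures 10.5 hours, efficiency is ~95%.
print(round(scaling_efficiency(10.5, 160, 16), 3))       # 0.952
```

"Near-perfect" scaling simply means that efficiency number staying close to 1.0 as units are added, rather than decaying the way it often does when inter-GPU communication becomes the bottleneck.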

"GPU-impossible work was demonstrated by one of Andromeda's first users, who achieved near perfect scaling on GPT-J at 2.5 billion and 25 billion parameters with long sequence lengths, an MSL of 10,240. The users tried to do the same work on Polaris, a 2,000 Nvidia A100 cluster, and the GPUs were unable to do the work because of GPU memory and memory bandwidth limitations."

Whether these claims hold up to external scrutiny remains to be seen, but in an era when companies often train deep-learning models on increasingly large clusters of Nvidia GPUs, Cerebras appears to offer an alternative approach.

How does Andromeda stack up against other supercomputers? Currently, the world's fastest, Frontier, resides at Oak Ridge National Laboratory and can perform at 1.103 exaflops at 64-bit double precision. That computer cost $600 million to build.

Access to Andromeda is available now for multiple users remotely. It is already being used by commercial writing assistant JasperAI, Argonne National Laboratory, and the University of Cambridge for research.
