Cerebras Systems Unveils World’s Fastest AI Chip with Whopping 4 Trillion Transistors
Third Generation 5nm Wafer Scale Engine (WSE-3) Powers Industry’s Most Scalable AI Supercomputers, Up To 256 exaFLOPs via 2048 Nodes
Bangalore, India –
March 15, 2024 – Cerebras Systems, the
pioneer in accelerating generative AI, has doubled down on its existing world record of fastest AI chip with
the introduction of the Wafer Scale Engine 3. The WSE-3 delivers twice the
performance of the previous record-holder, the Cerebras WSE-2, at the same
power draw and for the same price. Purpose built for training the industry’s
largest AI models, the 5nm-based, 4 trillion transistor WSE-3 powers the Cerebras
CS-3 AI supercomputer, delivering 125 petaflops of peak AI performance through 900,000
AI optimized compute cores.
Key Specs:
- 4 trillion transistors
- 900,000 AI cores
- 125 petaflops of peak AI performance
- 44GB on-chip SRAM
- 5nm TSMC process
- External memory: 1.5TB, 12TB, or 1.2PB
- Trains AI models up to 24 trillion parameters
- Cluster size of up to 2048 CS-3 systems
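The headline cluster figure follows directly from the per-system spec. A back-of-envelope check, using only the numbers from the spec list above:

```python
# Back-of-envelope check of the cluster peak-performance claim, using the
# per-system peak and maximum cluster size from the spec list.
PETA = 1e15
EXA = 1e18

per_system_pflops = 125       # peak AI performance of one CS-3, in petaflops
max_cluster_systems = 2048    # maximum cluster size

cluster_exaflops = per_system_pflops * PETA * max_cluster_systems / EXA
print(cluster_exaflops)  # → 256.0, matching the "up to 256 exaFLOPs" headline
```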
With a huge memory
system of up to 1.2 petabytes, the CS-3 is designed to train next generation
frontier models 10x larger than GPT-4 and Gemini. 24 trillion parameter models can
be stored in a single logical memory space without partitioning or refactoring,
dramatically simplifying training workflow and accelerating developer
productivity. Training a one-trillion parameter model on the CS-3 is as
straightforward as training a one billion parameter model on GPUs.
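The claim that a 24-trillion-parameter model fits in the 1.2 PB external memory tier can also be sanity-checked arithmetically. The raw budget works out to 50 bytes per parameter, comfortably above the roughly 16 bytes per parameter typically cited for fp32 master weights plus Adam optimizer states (that per-parameter figure is a general rule of thumb, not a number from this release):

```python
# Rough memory-budget check: bytes available per parameter when storing a
# 24-trillion-parameter model in the largest external memory configuration.
external_memory_bytes = 1.2e15   # 1.2 PB, top-end external memory option
params = 24e12                   # 24 trillion parameters

bytes_per_param = external_memory_bytes / params
print(bytes_per_param)  # → 50.0
```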
The CS-3 is built for both enterprise and hyperscale needs. Compact four-system
configurations can fine-tune 70B models in a day, while at full scale, using
2048 systems, Llama 70B can be trained from scratch in a single day
– an unprecedented feat for generative AI.
The latest
Cerebras Software Framework provides native support for PyTorch 2.0 and the
latest AI models and techniques such as multi-modal models, vision
transformers, mixture of experts, and diffusion. Cerebras remains the only
platform that provides native hardware acceleration for dynamic and
unstructured sparsity, speeding up training by up to 8x.
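The intuition behind the sparsity speedup is that multiply-accumulates on zero-valued weights contribute nothing and can be skipped. A conceptual sketch (this is an illustration of the arithmetic, not Cerebras' hardware implementation): at one-eighth weight density, a hardware engine that skips zeros performs roughly 8x fewer operations than a dense one.

```python
import numpy as np

# Illustration of unstructured-sparsity savings (not Cerebras' implementation):
# zero-valued weights can be skipped entirely by hardware that supports it.
rng = np.random.default_rng(0)
dense_w = rng.standard_normal((512, 512))

density = 1 / 8                                # keep ~12.5% of weights nonzero
mask = rng.random(dense_w.shape) < density
sparse_w = np.where(mask, dense_w, 0.0)

dense_macs = dense_w.size                      # work for a dense matrix-vector op
sparse_macs = int(np.count_nonzero(sparse_w))  # work if zeros are skipped
print(dense_macs / sparse_macs)                # ≈ 8x fewer multiply-accumulates
```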
“When we started on this journey eight years ago, everyone said wafer-scale processors were a pipe dream. We could not be more proud to be introducing the third generation of our groundbreaking wafer-scale AI chip,” said Andrew Feldman, CEO and co-founder of Cerebras.
“WSE-3 is the fastest AI
chip in the world, purpose-built for the latest cutting-edge AI work, from mixture
of experts to 24 trillion parameter models. We are thrilled to bring WSE-3 and
CS-3 to market to help solve today’s biggest AI challenges.”
Superior Power Efficiency and Software Simplicity
With every component
optimized for AI work, the CS-3 delivers more compute performance in less space
and with less power than any other system. While GPU power consumption doubles
from generation to generation, the CS-3 doubles performance while staying within
the same power envelope. The CS-3 offers superior ease of use, requiring 97% less
code than GPUs for LLMs and the ability to train models ranging from 1B to 24T
parameters in purely data parallel mode. A standard implementation of a
GPT-3-sized model requires just 565 lines of code on Cerebras – an industry
record.
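Pure data parallelism, as described above, means every replica holds the full model and sees a different slice of the batch, with gradients averaged before each shared weight update. A minimal conceptual sketch in NumPy (illustrative only; it is not the Cerebras software stack, and the linear model and learning rate are arbitrary choices):

```python
import numpy as np

# Minimal sketch of pure data parallelism: replicas compute gradients on
# disjoint shards of one batch, then the averaged gradient updates the
# shared weights. Equal shard sizes make the average equal the full-batch
# gradient.
rng = np.random.default_rng(0)
w = np.zeros(4)                                   # shared model weights
x, y = rng.standard_normal((64, 4)), rng.standard_normal(64)

def grad(w, xb, yb):
    # gradient of mean squared error for a linear model y ≈ xb @ w
    return 2 * xb.T @ (xb @ w - yb) / len(xb)

replicas = 4
shards = np.array_split(np.arange(64), replicas)  # one shard per replica
grads = [grad(w, x[s], y[s]) for s in shards]     # computed in parallel
w -= 0.1 * np.mean(grads, axis=0)                 # all-reduce, then update
```

The appeal of this mode is that scaling out changes only the sharding, never the model code, which is what makes training a 1B- and a 24T-parameter model look the same to the developer.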
Industry Partnerships and Customer Momentum
Cerebras already has a
sizeable backlog of orders for CS-3 across enterprise, government and
international clouds.
“As a
long-time partner of Cerebras, we are excited to see what’s possible with the
evolution of wafer-scale engineering. CS-3 and the supercomputers based on this
architecture are powering novel scale systems that allow us to explore the
limits of frontier AI and science,” said Rick Stevens, Argonne National Laboratory Associate
Laboratory Director for Computing, Environment and Life Sciences. “The
audacity of what Cerebras is doing matches our ambition, and it matches how we
think about the future.”
“As part of our multi-year strategic collaboration with
Cerebras to develop AI models that improve patient outcomes and diagnoses, we are
excited to see advancements being made on the technology capabilities to
enhance our efforts,” said Dr. Matthew Callstrom, M.D., Mayo Clinic’s medical director for strategy and chair of radiology.
The CS-3 will also play
an important role in the pioneering strategic partnership between Cerebras and
G42. The Cerebras and G42 partnership has already delivered 8 exaFLOPs of AI
supercomputer performance via Condor Galaxy 1 (CG-1) and Condor Galaxy 2 (CG-2).
Both CG-1 and CG-2, deployed in California, are among the largest AI supercomputers
in the world.
Today, Cerebras and G42 announced that Condor Galaxy 3, the third installation in the Condor Galaxy network, is under construction. It will be built with 64 CS-3 systems, producing 8 exaFLOPs of AI compute and making it one of the largest AI supercomputers in the world. The Cerebras-G42 strategic partnership is set to deliver tens of exaFLOPs of AI compute. Condor Galaxy has trained some of the industry’s leading open-source models, including Jais-30B, Med42, Crystal-Coder-7B, and BTLM-3B-8K.
“Our strategic
partnership with Cerebras has been instrumental in propelling innovation at G42,
and will contribute to the acceleration of the AI revolution on a global scale,”
said Kiril Evtimov, Group CTO of G42. “Condor
Galaxy 3, our next AI supercomputer boasting 8 exaFLOPs, is currently under
construction and will soon bring our system’s total production of AI compute to
16 exaFLOPs.”
For more information,
please visit https://www.cerebras.net/product-system/.
About Cerebras Systems
Cerebras Systems is a team of pioneering computer architects, computer scientists, deep learning researchers, and engineers of all types. We have come together to accelerate generative AI by building from the ground up a new class of AI supercomputer. Our flagship product, the CS-3 system, is powered by the world’s largest and fastest AI processor, our Wafer-Scale Engine-3. CS-3s are quickly and easily clustered together to make the largest AI supercomputers in the world, and make placing models on the supercomputers dead simple by avoiding the complexity of distributed computing. Leading corporations, research institutions, and governments use Cerebras solutions for the development of pathbreaking proprietary models, and to train open-source models with millions of downloads. Cerebras solutions are available through the Cerebras Cloud and on premise. For further information, visit https://www.cerebras.net.