Tom’s Hardware
Technology
Mark Tyson

Elon Musk is doubling the world's largest AI GPU cluster — expanding Colossus GPU cluster to 200,000 'soon,' has floated 300,000 in the past

Four banks of xAI's HGX H100 server racks, holding eight servers each.

Billionaire Elon Musk has taken to Twitter/X to boast that his remarkable xAI data center is set to double its firepower “soon.” He was commenting on the recent video exposé of his xAI Colossus AI supercomputer, in which TechTuber ServeTheHome was stunned by the gleaming rows of Supermicro servers packed with 100,000 state-of-the-art Nvidia enterprise GPUs.

So, the xAI Colossus AI supercomputer is on course “Soon to become a 200k H100/H200 training cluster in a single building,” according to Musk. Its 100,000-GPU incarnation, which only began AI training about two weeks ago, was already notable. We think “soon” might indeed be soon in this case; however, Musk’s prior tech timing slippages (e.g., Tesla's full self-driving, Hyperloop delays, SolarCity struggles) mean we should be generally cautious about his forward-looking boasts.

The xAI Colossus has already been dubbed an engineering marvel, and praise for the supercomputer’s prowess isn’t limited to the usual Musk toadies. Nvidia CEO Jensen Huang also described the project as a “superhuman” feat that had “never been done before.” xAI engineers must have worked long, hard hours to set up the Colossus supercomputer in just 19 days; projects of this scale and complexity typically take up to four years to get running, Huang indicated.

What will the 200,000 H100/H200 GPUs be used for? This very considerable computing resource will probably not be tasked with making scientific breakthroughs for the benefit of mankind. Instead, the 200,000 power-hungry GPUs are likely destined to train AI models and chatbots like Grok 3, ramping up the potency of its machine-learning-distilled ‘anti-woke’ retorts.

This isn’t the endgame for xAI Colossus hardware expansion, far from it. Musk has previously touted a Colossus packing 300,000 Nvidia H200 GPUs.

At the current pace of upgrades, we could even see Musk tweeting about reaching this 300,000 goal before 2024 is out. If anything delays ‘Grok 300,000,’ it could be factors outside of Musk’s control, like GPU supplies. We have also previously reported that on-site power generation had to be beefed up to cope with even stage 1 of xAI's Colossus, so that’s another hurdle, alongside complex liquid cooling and networking hardware.
