Tom’s Hardware
Anton Shilov

Elon Musk plans to scale the xAI supercomputer to a million GPUs — currently at over 100,000 H100 GPUs and counting

Image: Charles Liang of Supermicro and Elon Musk in a gigafactory.

Elon Musk's AI company, xAI, is set to expand its Colossus supercomputer to over one million GPUs, reports the Financial Times. Once expanded, Colossus will be one of the most powerful supercomputers in the world, but getting there will require significant investment, a steady supply of GPUs, and available power and datacenter infrastructure.

Colossus, which is used to train the large language model behind Grok, already operates over 100,000 Nvidia H100 processors and is set to double that number shortly, which would make it the largest supercomputer in a single building. The push to one million GPUs is underway, though it will take considerably more time and effort. To carry out the expansion, xAI is working with Nvidia, Dell, and Supermicro. Furthermore, Memphis, Tennessee, where Colossus is located, has reportedly established a dedicated xAI operations team to support the project.

It is unclear whether xAI plans to use current-generation Hopper or next-generation Blackwell GPUs for the expansion. The Blackwell platform is expected to scale better than Hopper, so the upcoming technology makes more sense than the current one. In any case, procuring an additional 800,000 to 900,000 AI GPUs will be difficult, as demand for Nvidia's products is overwhelming. Another challenge is making 1,000,000 GPUs work in concert at maximum efficiency, and here, again, Blackwell would be the better fit.

The financial requirements of this expansion are colossal, of course. Acquiring GPUs — costing tens of thousands of dollars each — alongside infrastructure for power and cooling, could push investment into the tens of billions. xAI has raised $11 billion this year and recently secured another $5 billion. Currently, the company is valued at $45 billion.
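For a rough sense of scale, a back-of-the-envelope estimate shows how the hardware bill alone reaches tens of billions of dollars. The per-GPU price and GPU count below are assumptions for illustration, not figures reported in the article:

```python
# Back-of-the-envelope estimate of the GPU bill for the planned expansion.
# Assumptions (not from the article): ~$30,000 per accelerator, and roughly
# 900,000 additional GPUs needed to go from ~100,000 to 1,000,000.
additional_gpus = 900_000
price_per_gpu_usd = 30_000  # assumed average price per H100/Blackwell-class GPU

gpu_cost = additional_gpus * price_per_gpu_usd
print(f"GPU hardware alone: ~${gpu_cost / 1e9:.0f} billion")  # ~$27 billion

# Power, cooling, networking, and buildings come on top of this,
# which is why total investment lands in the tens of billions.
```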

Unlike rivals such as OpenAI, which partners with Microsoft for computing power, and Anthropic, which is backed by Amazon, xAI is building its supercomputing capacity independently. The strategy puts the company in a high-stakes race to secure advanced AI hardware, but given the scale of xAI's investments, it actually puts Musk's company ahead of its rivals.

Despite its rapid progress, xAI has faced criticism for allegedly bypassing planning permissions and for the project's strain on the regional power grid. To address these concerns, the company has emphasized grid stability measures, including deploying Tesla Megapack battery storage to manage power demands.

While xAI's focus on hardware has earned acclaim, its commercial offerings remain limited. Grok reportedly lags behind leading models like ChatGPT and Google's Gemini in both sophistication and user base. However, investors view Colossus as a foundational achievement that demonstrates xAI's ability to rapidly deploy cutting-edge technology.
