Investors on Monday finally woke up to the success of a relatively small Chinese player in the AI world. Spooked by a model reportedly trained on a minuscule budget, and from a country that was supposedly behind the U.S. in AI capabilities, investors dumped shares across the entire tech sector, including those of Nvidia, ASML, and Alphabet.
Last week, DeepSeek unveiled its R1 model, which, the startup claims, matches, if not exceeds, the performance of OpenAI’s o1 model released last year. (o1 is designed to tackle reasoning and math problems.) DeepSeek has released its models to the public, and reviewers are impressed with their ability to handle tasks like coding and reasoning.
The R1 unveiling follows a December announcement from the Chinese startup that its large language model, V3, was trained using just $5.6 million worth of computing power, far less than the $100 million–plus OpenAI reportedly spent to train its GPT-4. DeepSeek said its V3 model matched the performance of OpenAI’s and Anthropic’s models on leading benchmarks, and Andrej Karpathy, who worked on AI for Tesla and OpenAI, praised DeepSeek’s ability to train its leading-edge AI on a “joke of a budget.”
Then, on Monday, DeepSeek released another model: the image-generating Janus Pro, which the company claims performs better than rivals Dall-E, from OpenAI, and Stable Diffusion.
These announcements have put DeepSeek at the forefront of China’s already-crowded AI sector. DeepSeek’s 40-year-old founder, Liang Wenfeng, met with Chinese Premier Li Qiang on Jan. 20, joining a group of tech industry leaders, according to the South China Morning Post.
DeepSeek’s success shows that China’s tech sector has momentum in the global race to build AI, despite being denied access to top U.S.-made chips from Nvidia after President Biden banned their export to China.
Investors, too, are now wary of claims that massive capital expenditure on chips and data centers is needed to fuel the AI boom. Nvidia, the leading maker of AI chips, alone lost over $600 billion in value on Monday after its shares dropped almost 17%. Shares of energy companies like Constellation Energy, which are banking on a surge in electricity demand from power-hungry data centers, also plunged by double digits.
In contrast, Chinese tech shares rose on Monday amid greater optimism about the country’s AI ability. Shares of Baidu and Kuaishou, two Chinese Big Tech companies investing in AI, rose almost 4%. Meanwhile, Chinese AI firm SenseTime rose 7.2%. (DeepSeek is a private company, so retail investors can’t buy its shares.)
But not every Chinese firm emerged unscathed. Semiconductor Manufacturing International Corporation (SMIC), a Chinese chipmaker reportedly working with Chinese tech giant Huawei on its chips, dropped 7.6% in Hong Kong trading on Monday.
Where did DeepSeek come from?
DeepSeek, based in Hangzhou, was born from HighFlyer, a Chinese quantitative hedge fund with 10 billion yuan ($1.4 billion today) in assets under management as of 2019, according to the South China Morning Post. The fund spun off DeepSeek in 2023, setting it up as an AI startup to develop models and build AI products.
DeepSeek founder Liang studied AI at Zhejiang University before cofounding HighFlyer, whose deep pockets allowed it to snap up thousands of Nvidia AI chips ahead of U.S. restrictions in 2022. That gave DeepSeek a leg up over many of its smaller competitors, enabling it to keep working on and training models as other Chinese AI startups scrambled to find processing power.
AI experts differ on how well DeepSeek performs against OpenAI’s ChatGPT and Anthropic’s Claude; some suggest the Chinese AI model meets benchmarks under specific hardware configurations while struggling in other scenarios.
But DeepSeek is focused on matching what OpenAI and Anthropic have achieved, only more efficiently and at a lower cost. The startup’s models use various innovations to avoid the limits of its “mixture of experts” model, in which different parts of the AI are trained to tackle specific kinds of questions.
The resulting savings might be enough to win over customers who are otherwise priced out of expensive U.S.-developed models, particularly those now restricted from accessing vast reserves of American computing power.
Another difference is that DeepSeek’s model is open-source, allowing it to be used on different kinds of hardware. And, more importantly, the model shows users how it got its answer, which gives them a sense of how it works, unlike OpenAI’s o1.
DeepSeek’s web and app interface also complies with China’s rules on internet censorship of sensitive topics like the 1989 student protests in Tiananmen Square. However, locally hosted versions of DeepSeek’s work, enabled by DeepSeek’s embrace of open-source models and running on a user’s own hardware, can respond to questions that are normally blocked in China, though with some inconsistency, reports Fortune’s David Meyer.
Why is the U.S. worried about China’s AI?
If U.S. officials had their way, a Chinese company would not have been able to produce a leading-edge AI model.
The U.S. has controlled sales of advanced AI chips to China since 2022, preventing Chinese companies from accessing processors needed to train leading-edge AI models. Chipmakers like Nvidia and Intel have tried to create processors for the Chinese market that comply with U.S. requirements, only for Washington to tighten the rules further.
That leaves Chinese AI companies with few options: Rely on U.S.-made chips imported before bans came into effect; tap gray-market smuggling rings that can ship chips from third countries; rely on data centers outside China; or turn to Chinese-made alternatives from companies like Huawei. (Huawei claims its AI chips outperform Nvidia’s A100 processor, but the Chinese tech giant has reportedly struggled to make the chips reliably at scale.)
“Money has never been the problem for us; bans on shipments of advanced chips are the problem,” Liang told Chinese outlet 36Kr last year (translated by the ChinaTalk newsletter in November).
In mid-2022, months before the Biden administration imposed its export controls, HighFlyer revealed that it had snapped up 10,000 of Nvidia’s A100 processors. DeepSeek’s V3 paper, released in December, also claims the model was trained on Nvidia H800 processors, a version of the H100 processor that complied with the U.S.’s original export controls. (Both the A100 and H800 are now banned under the current export control regime.)
Nvidia, in a statement on Monday, argued that DeepSeek’s work shows what can be possible using “fully export control compliant” computing power.
David Sacks, U.S. President Donald Trump’s “AI czar,” posted Monday that “DeepSeek R1 shows that the AI race will be very competitive and that President Trump was right to rescind the Biden [executive order], which hamstrung American AI companies without asking whether China would do the same.”
In addition to the chip export ban, the Biden administration banned U.S. investment in Chinese AI.
Still, China has fostered a vibrant and diverse AI sector. Big Tech firms like Baidu, Alibaba, and ByteDance are developing their own foundational models and offering new AI services to companies and ordinary users. Chinese AI startups like MiniMax and Moonshot AI have released consumer-focused services that have even had some success in the U.S. market. (ByteDance also updated its AI model Doubao last week, offering AI at prices even lower than DeepSeek’s.)
Yet China’s AI sector is crowded, which means vendors are locked in a price war to push out their competitors. Throughout 2024, companies like Alibaba and ByteDance slashed prices by as much as 90% to promote their models over the competition.
China’s success in AI is unnerving the U.S. because it implies that the broad measures taken to protect U.S. leadership in AI aren’t working. “I thought the restrictions we placed on chips would keep them back,” former Google CEO Eric Schmidt said last November at Harvard’s Kennedy School.
The developer behind ChatGPT is already sounding the alarm about China. Last week, OpenAI claimed in a policy paper that there was an “estimated $175 billion in global funds awaiting investment in AI projects.”
“If the U.S. doesn’t attract those funds, they will flow to China-backed projects—strengthening the Chinese Communist Party’s global influence,” OpenAI said.
OpenAI hopes it’ll soon get to tap some of that money. On Jan. 21, OpenAI CEO Sam Altman, SoftBank CEO Masayoshi Son, and Oracle cofounder Larry Ellison announced the Stargate Project, a venture that pledges to invest $500 billion in AI infrastructure across the U.S. over the next four years.
That $500 billion price tag isn’t looking as attractive anymore.
Update, Jan. 27, 2025: This article has been updated to account for Monday’s plunge in tech stocks, as well as additional comments on DeepSeek.