LiveScience
Ben Turner

Chinese researchers just built an open-source rival to ChatGPT in 2 months. Silicon Valley is freaked out.

The DeepSeek logo displayed on a smartphone screen.

China has released a cheap, open-source rival to OpenAI's ChatGPT, and it has some scientists excited and Silicon Valley worried.

DeepSeek, the Chinese artificial intelligence (AI) lab behind the innovation, unveiled its free large language model (LLM) DeepSeek-V3 in late December 2024 and claims it was built in two months for just $5.58 million — a fraction of the time and cost required by its Silicon Valley competitors.

Following hot on its heels is an even newer model called DeepSeek-R1, released Monday (Jan. 20). In third-party benchmark tests, DeepSeek-V3 matched the capabilities of OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet while outperforming others, such as Meta's Llama 3.1 and Alibaba's Qwen2.5, in tasks that included problem-solving, coding and math.

Now, R1 has also surpassed OpenAI's latest o1 model in many of the same tests. This impressive performance at a fraction of the cost of other models, its semi-open-source nature, and its training on significantly fewer graphics processing units (GPUs) have wowed AI experts and raised the specter of China's AI models surpassing their U.S. counterparts.

"We should take the developments out of China very, very seriously," Satya Nadella, the CEO of Microsoft, a strategic partner of OpenAI, said at the World Economic Forum in Davos, Switzerland, on Jan. 22..


AI systems learn from training data created by humans, which enables them to generate output based on the probabilities of different patterns cropping up in that training dataset.

For large language models, these data are text. For instance, OpenAI's GPT-3.5, which was released in 2022, was trained on roughly 570 GB of text data from the repository Common Crawl (roughly 300 billion words) taken from books, online articles, Wikipedia and other webpages.
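That pattern-matching idea can be illustrated with a toy example. The sketch below is illustrative only: the tiny corpus, and the simple word-pair (bigram) approach, are simplifications not taken from the article. It counts which word follows which in a short text and then samples the next word in proportion to those counts; real LLMs learn the same kind of next-word probabilities with neural networks trained on billions of documents.

```python
import random
from collections import Counter, defaultdict

# Toy corpus standing in for web-scale training text (hypothetical data).
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each other word (bigram counts).
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def next_word(prev):
    """Sample the next word in proportion to how often it followed `prev`."""
    counts = follow_counts[prev]
    if not counts:  # no observed continuation (e.g. the corpus's last word)
        return random.choice(corpus)
    words = list(counts)
    return random.choices(words, weights=[counts[w] for w in words])[0]

# Generate a short continuation, one probable word at a time.
words = ["the"]
for _ in range(5):
    words.append(next_word(words[-1]))
print(" ".join(words))
```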

Reasoning models, such as R1 and o1, are an upgraded version of standard LLMs that use a method called "chain of thought" to backtrack and reevaluate their logic, which enables them to tackle more complex tasks with greater accuracy.

This has made reasoning models popular among scientists and engineers who are looking to integrate AI into their work.
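A reasoning model's internal chain of thought isn't something outside code can reproduce directly, but the idea can be roughly approximated from the outside by prompting an ordinary chat model to lay out and check its intermediate steps. Here is a minimal sketch using OpenAI's Python SDK; the model name, question and prompt wording are illustrative assumptions, not details from the article, and the prompt-based trick is a simplified stand-in for the built-in reasoning R1 and o1 perform.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

question = ("A store sells pens in packs of 12 for $3. "
            "How much do 60 pens cost?")

# Direct prompt: the model may jump straight to an answer.
direct = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[{"role": "user", "content": question}],
)

# Chain-of-thought-style prompt: ask for the steps to be written out and
# checked, a rough external stand-in for what reasoning models do internally.
stepwise = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": question + " Work through it step by step and "
                              "double-check each step before the final answer.",
    }],
)

print("Direct answer:\n", direct.choices[0].message.content)
print("\nStep-by-step answer:\n", stepwise.choices[0].message.content)
```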

But unlike OpenAI's o1, DeepSeek is an "open-weight" model that (although its training data remains proprietary) enables users to peer inside and modify its algorithm. Just as important is its far lower price for users: roughly one-27th of o1's.
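In practice, "open-weight" means anyone can download the model's parameters and run, inspect or fine-tune them locally rather than only querying them through an API. Here is a minimal sketch using the Hugging Face transformers library; the repository ID is an assumption (DeepSeek publishes several distilled R1 variants on the Hugging Face hub), and even the small variants need capable hardware to run comfortably.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository ID for a small distilled R1 variant; check the
# Hugging Face hub for the exact name before running.
repo = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

# Because the weights are downloaded locally, they can be modified or
# fine-tuned, not just queried.
inputs = tokenizer("How many prime numbers are there below 30?",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```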

Besides its performance, the hype around DeepSeek comes from its cost efficiency; the model's shoestring budget is minuscule compared with the tens to hundreds of millions of dollars that rival companies have spent training comparable models.

In addition, U.S. export controls, which limit Chinese companies' access to the best AI computing chips, forced R1's developers to build smarter, more energy-efficient algorithms to compensate for their lack of computing power. ChatGPT reportedly needed 10,000 Nvidia GPUs to process its training data; DeepSeek's engineers say they achieved similar results with just 2,000.

How much this will translate into useful scientific and technical applications, or whether DeepSeek has simply trained its model to ace benchmark tests, remains to be seen. Scientists and AI investors are watching closely.
