Euronews
Pascale Davies

China’s DeepSeek finds a way to help AI get better at answering questions. Here’s how it works

Chinese AI start-up DeepSeek has introduced a new way to improve the reasoning capabilities of large language models (LLMs), which it says delivers better and faster answers to general queries than its competitors' models.

DeepSeek sparked a frenzy in January when it came onto the scene with R1, an artificial intelligence (AI) model and chatbot that the company claimed was cheaper than OpenAI's rival ChatGPT model and performed just as well.

Collaborating with researchers from China’s Tsinghua University, DeepSeek said in its latest paper released on Friday that it had developed a technique for self-improving AI models. 

The underlying technology is called self-principled critique tuning (SPCT), which trains AI to develop its own rules for judging content and then uses those rules to provide detailed critiques. 

The technique gets better results by running several evaluations simultaneously rather than by using larger models.
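
As a rough illustration of running several evaluations at once and combining them, here is a minimal Python sketch. It is not DeepSeek's code: `judge_once`, `judge_many`, the 1-to-10 scale and the word-count "quality" proxy are all hypothetical stand-ins, and the random noise only imitates the variation between separate runs of a real judge model.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from statistics import fmean


def judge_once(answer: str, seed: int) -> int:
    """Hypothetical single evaluation: a toy 1-10 score with sampling noise."""
    rng = random.Random(seed)                    # one seeded generator per run
    base = min(10, max(1, len(answer.split())))  # toy proxy for answer quality
    noise = rng.choice([-2, -1, 0, 1, 2])        # stands in for model randomness
    return min(10, max(1, base + noise))


def judge_many(answer: str, k: int = 8) -> float:
    """Run k independent evaluations concurrently and average their scores."""
    with ThreadPoolExecutor(max_workers=k) as pool:
        scores = list(pool.map(lambda s: judge_once(answer, s), range(k)))
    return fmean(scores)


if __name__ == "__main__":
    print(judge_many("Paris is the capital of France", k=8))
```

Averaging is only one way to combine the runs; a vote or a second scoring step would fit the same outline. The point of the sketch is that the extra work happens when the question is answered, not by training a bigger model.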

This approach is known as generative reward modeling (GRM): a machine learning system, trained here with SPCT, that checks and rates what AI models produce to make sure the output matches what humans ask for.

How does it work?

Usually, improving AI requires making models bigger during training, which takes a lot of human effort and computing power. Instead, DeepSeek has created a system with a built-in "judge" that evaluates the AI's answers in real time.

When you ask a question, this judge compares the AI's planned response against both the AI's core rules and what a good answer should look like. 

If there's a close match, the AI gets positive feedback, which helps it improve.
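
The judging loop described above can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the `Principle` class, the keyword check and the 0.5 threshold are hypothetical, and simple keyword matching stands in for the written critiques a real reward model would generate.

```python
from dataclasses import dataclass


@dataclass
class Principle:
    name: str
    keywords: tuple[str, ...]  # hypothetical: words a good answer should mention


def judge(answer: str, principles: list[Principle]) -> tuple[float, list[str]]:
    """Return (score, critique), where score is the share of principles met."""
    words = set(answer.lower().split())
    met = 0
    critique = []
    for p in principles:
        ok = any(k in words for k in p.keywords)
        met += ok
        critique.append(f"{p.name}: {'met' if ok else 'missed'}")
    return met / len(principles), critique


def feedback(score: float, threshold: float = 0.5) -> int:
    """Toy reward signal: 1 for a close match, 0 otherwise."""
    return 1 if score >= threshold else 0


if __name__ == "__main__":
    rules = [
        Principle("cites a source", ("according",)),
        Principle("gives a figure", ("percent",)),
    ]
    score, notes = judge("According to the report, growth was 3 percent", rules)
    print(score, notes, feedback(score))
```

In this toy version the rules are written by hand. The article describes a system that develops its own rules, which this sketch does not attempt to reproduce.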

DeepSeek calls this self-improving system "DeepSeek-GRM". The researchers said this would help models perform better than competitors like Google's Gemini, Meta's Llama, and OpenAI's GPT-4o.

DeepSeek plans to make these advanced AI models available as open-source software, but no timeline has been given. 

The paper’s release comes as rumours swirl that DeepSeek is set to unveil its latest R2 chatbot. But the company has not commented publicly on any such new release.
