Evening Standard
Technology
Andrew Williams

OpenAI accuses DeepSeek of using its work

DeepSeek operates in a similar way to OpenAI’s ChatGPT - (Andy Wong / AP)

AI giant OpenAI claims it has evidence that DeepSeek used OpenAI's work to develop the seemingly revolutionary DeepSeek AI models released earlier this month.

According to the FT, OpenAI says DeepSeek used a process called distillation to develop its models. This is where the answers or outputs of a larger or more complex AI model are used to train a newer one.
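
To make the idea concrete, here is a minimal toy sketch of distillation in Python. It is only an illustration of the general technique described above, not a description of how DeepSeek or OpenAI actually train their models. The names `teacher`, `student` and `train_student` are hypothetical, and the "teacher" is a simple fixed formula standing in for a large model. The student never sees the teacher's internals, only its answers to a set of queries.

```python
import math
import random


def teacher(x: float) -> float:
    """Hypothetical stand-in for a large model: maps an input to a probability."""
    return 1.0 / (1.0 + math.exp(-(3.0 * x - 1.0)))


def student(x: float, w: float, b: float) -> float:
    """A tiny model with two trainable parameters, w and b."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def train_student(inputs, soft_labels, epochs=2000, lr=0.5):
    """Fit the student to the teacher's outputs with plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(inputs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, target in zip(inputs, soft_labels):
            p = student(x, w, b)
            # Gradient of the cross-entropy loss against the teacher's soft label.
            grad_w += (p - target) * x
            grad_b += p - target
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


random.seed(0)
# Step 1: send queries to the teacher and record its answers.
queries = [random.uniform(-2.0, 2.0) for _ in range(200)]
answers = [teacher(x) for x in queries]

# Step 2: train the student on those (query, answer) pairs alone.
w, b = train_student(queries, answers)

# Step 3: measure how closely the student now imitates the teacher.
max_gap = max(abs(student(x, w, b) - teacher(x)) for x in queries)
print(f"largest disagreement between student and teacher: {max_gap:.4f}")
```

In this toy setup the student ends up reproducing the teacher's behaviour almost exactly, even though it was trained only on the teacher's outputs. Real language models are vastly larger and the training data is text rather than single numbers, but the principle of learning from another model's answers is the same.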

Microsoft, which is a major investor in OpenAI and uses the company's models for its Copilot AI features, is investigating the situation, according to Bloomberg.

Bloomberg reports that Microsoft’s security team has identified a group with links to DeepSeek that ran large amounts of data through OpenAI's APIs.

Such data transfer would happen with any company using OpenAI's APIs to power its own software. However, using the outputs for distillation is against OpenAI's terms of service.

OpenAI is estimated to have spent $80m to $100m (£64m to £80m) developing its GPT-4 AI model, while DeepSeek’s white paper on the development of its latest models provides a lower estimate of costs of just $6m (£4.8m).

DeepSeek's figure is based on the market rate for the compute time needed for the training, not the full cost to the business, but it demonstrates the benefits of DeepSeek’s approach regardless.

Tech investor David Sacks, recently appointed as Donald Trump’s AI and crypto czar, told Fox News he believes “it’s possible” IP theft was involved in DeepSeek’s development. He says there is “substantial evidence” distillation was used.

OpenAI has been sued by multiple newspapers, book publishers, music rights organisations, authors and other media bodies across the world for the way its AI steals human-made content in the training process.

OpenAI has previously admitted its generative AI tools could not exist without the use of copyrighted materials, for which it has not paid.

“It would be impossible to train today's leading AI models without using copyrighted materials,” a statement submitted to the House of Lords read.

“Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today's citizens.”
