On Sunday, OpenAI unveiled Deep Research, an agentic AI tool that can conduct multi-step research on the internet for complex tasks. The ChatGPT maker says the tool performs at the level of a human research analyst, claiming that what the agent accomplishes in ten minutes would take a person several hours.
And so far, the tool appears to be living up to the hype. According to benchmark results shared for Humanity's Last Exam, arguably the hardest AI exam, released less than two weeks ago, Deep Research holds a significant lead over OpenAI's own o3-mini and DeepSeek's R1 model, which is built on DeepSeek V3 (via TechRadar).
For context, Humanity's Last Exam was assembled by subject-matter experts around the world and features some of the most difficult questions ever posed to AI models. DeepSeek R1 previously held a significant lead over proprietary models with a 9.4% accuracy score.
However, the Chinese AI model was dethroned from the top spot following the launch of OpenAI's o3-mini, which posted a 10.5% accuracy score. Things got more interesting when the reasoning setting was bumped up to o3-mini-high, pushing the score to 13%. The gap between the two settings comes down to the latter spending more time analyzing and reasoning when presented with a complex query.
On the other hand, OpenAI's new Deep Research agentic AI tool scored 26.6% on Humanity's Last Exam, translating to a 183% increase in accuracy.
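For the curious, that 183% figure only checks out if the baseline is DeepSeek R1's 9.4% score rather than o3-mini-high's 13%:

$$\frac{26.6 - 9.4}{9.4} \approx 1.83 \quad\Longrightarrow\quad \text{roughly a } 183\% \text{ increase}$$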
Granted, the tool ships with built-in search capabilities that allow it to scour the web for answers to some of the general-knowledge questions featured in the test, ultimately giving it a competitive advantage over the other models in the running.
An OpenAI employee described his experience with Deep Research as "a personal AGI moment," writing:
"Using Deep Research has been a personal AGI moment for me. It takes 10 mins to generate accurate and thorough competitive and market research (with sources) that previously used to take me 3 hours."