Fortune
Sage Lazzaro

No one knows what an 'AI agent' is—but there's a growing consensus they aren't living up to the hype

A phone displaying a title saying "manus" and "the general AI agent." (Credit: Lam Yik—Bloomberg via Getty Images)

Hello and welcome to Eye on AI. In today’s edition… Companies experimenting with AI agents say the tech falls short of expectations; Nvidia announces its new chips and positions itself for the post-DeepSeek landscape; Elon Musk and Nvidia join the Microsoft-BlackRock AI fund; AI spammers are “brute forcing” the internet; and Foxconn emerges as a key player in the global AI race. 

Hardly a day goes by without a tech company announcing a new AI “agent” it says will revolutionize workflows and unlock unprecedented efficiencies. But while the makers of these agents—companies like Salesforce, Amazon, Oracle, and tons of startups—are hyping them, some of their customers are growing skeptical that these tools can deliver, at least right now. 

“Many customers report a gap between marketing and reality,” reads a new report from CB Insights, which analyzes the main pain points surrounding these products. 

Throughout March, CB Insights surveyed over 40 customers of AI agents and found that they’re running into issues with reliability, integration, and security. Other recent headline events have highlighted some of the same issues. For instance, there was a surge of excitement over Manus, which was billed as the first fully autonomous “general agent” and lauded by some as another DeepSeek moment for China—until user tests revealed unreliable performance and questionable outputs. 

The idea of an AI tool that can autonomously and accurately orchestrate and complete complex tasks makes sense as a goal to strive for, and it’s possible it can be achieved. But the current reality is that customers are navigating uncharted waters, and the hype cycle and muddled use of the term “agent” are causing confusion about what users can actually expect. 

(Un)reliability is top-of-mind

DeepMind founder and CEO Demis Hassabis recently offered an insightful description of the reliability issues surrounding AI agents, comparing error accumulation to compound interest. 

“If your AI model has a 1% error rate and you plan over 5,000 steps, that 1% compounds like compound interest,” he said this week at a Google event, according to Computer Weekly. He went on to explain that by the time those 5,000 steps have been worked through, the probability of the answer being correct is essentially “random.”
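Hassabis’s back-of-the-envelope point is easy to check. Here is a minimal Python sketch of the compounding math; the 1% per-step error rate and the 5,000-step count come from his quote, while the assumption that errors occur independently at each step is a simplification for illustration:

```python
def chain_success_probability(per_step_error: float, steps: int) -> float:
    """Probability that an agent completes every step without a single error,
    assuming each step fails independently with the given error rate."""
    return (1.0 - per_step_error) ** steps

# With a 1% error rate, the chance of a fully correct run collapses
# as the number of chained steps grows.
for steps in (10, 100, 1000, 5000):
    p = chain_success_probability(0.01, steps)
    print(f"{steps:>5} steps -> {p:.2e} chance of a fully correct run")
```

Even at 100 steps the success rate has already fallen to roughly a third, and by 5,000 steps it is indistinguishable from zero, which is the sense in which the final answer becomes “random.”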

For companies that need to deliver accurate information and serve their own customers, accuracy that is effectively random is not acceptable. CB Insights reported reliability as the top concern among customers using AI agents, with nearly half citing it as an issue. One customer, for example, described receiving partially processed information and hallucinations from an AI agent it deployed. 

Customers are also running into issues with integrating AI agents into their existing systems. A lack of interoperability has long caused headaches in the world of enterprise software, but with AI agents, integration is kind of the whole point. “It was a bit of a gamble that we were signing up for a product where they didn't have quite all the integrations that we wanted,” one customer told CB Insights. 

A new swath of security risks 

Security also ranks near the top of customer concerns, and for good reason. Having a technology connect to various systems that contain sensitive information and take action autonomously opens up huge risks. Gartner predicts that by 2028, 25% of enterprise breaches will be traced back to AI agent abuse by both internal and external malicious actors.

“Without proper governance, AI agents can and will inadvertently expose sensitive data, make unauthorized decisions, or create compliance blind spots,” Dimitri Sirota, CEO of data intelligence and compliance company BigID, told Eye on AI. 

He said the best way companies can experiment with AI agents safely is by avoiding products that aren’t transparent about how the AI agent makes decisions. Companies should also pilot AI agents in controlled environments so they can uncover risks and adjust as necessary before scaling. 

What even is an AI “agent”?

The market for AI agents is becoming saturated, especially in specific niches like customer support and coding. At the same time, “no one knows what the hell an AI agent is,” as TechCrunch bluntly put it in a story published last Friday, arguing that the term has become “diluted to the point of meaninglessness.”

Every company is defining “AI agent” a little differently. Some generally use the term to refer to fully autonomous AI systems that can execute tasks independently, while others use it to refer to systems that follow predefined workflows. Some offer yet other definitions. And some—such as OpenAI—seem to frequently change and contradict their own prior definitions. A lot of tools that were previously called “AI assistants” are now also being referred to as “agents.”

For IT leaders, this definitional chaos creates confusion and deployment headaches. Not only is it difficult to understand what the products do and how they work, but it’s also nearly impossible to meaningfully compare benchmarks and performance metrics across vendors. 

None of this is to say companies aren’t starting to see some benefits from AI agents. But it is a reminder that these are still very early days for this technology, and the hype is running well ahead of reality. 

And with that, here’s more AI news. 

Sage Lazzaro
sage.lazzaro@consultant.fortune.com
sagelazzaro.com
