Google said Friday it has made “more than a dozen technical improvements” to its artificial intelligence systems after its retooled search engine was found spitting out erroneous information.
The tech company unleashed a makeover of its search engine in mid-May that frequently provides AI-generated summaries on top of search results. Soon after, social media users began sharing screenshots of its most outlandish answers.
Google has largely defended its AI overviews feature, saying it is typically accurate and was tested extensively beforehand. But Liz Reid, the head of Google's search business, acknowledged in a blog post Friday that “some odd, inaccurate or unhelpful AI Overviews certainly did show up.”
While many of the examples were silly, others were dangerous or harmful falsehoods.
The Associated Press last week asked Google about which wild mushrooms to eat, and it responded with a lengthy AI-generated summary that was mostly technically correct, but “a lot of information is missing that could have the potential to be sickening or even fatal,” said Mary Catherine Aime, a professor of mycology and botany at Purdue University who reviewed Google's response to the AP's query.
For example, information about mushrooms known as puffballs was “more or less correct,” she said, but Google's overview emphasized looking for those with solid white flesh — which many potentially deadly puffball mimics also have.
In another widely shared example, an AI researcher asked Google how many Muslims have been president of the United States, and it responded confidently with a long-debunked conspiracy theory: “The United States has had one Muslim president, Barack Hussein Obama.”
Google last week made an immediate fix to prevent a repeat of the Obama error because it violated the company's content policies.
In other cases, Reid said Friday, the company has sought to make broader improvements, including “detection mechanisms for nonsensical queries” that shouldn’t be answered with an AI summary, such as “How many rocks should I eat?”
The AI systems were also updated to limit the use of user-generated content — such as social media posts on Reddit — that could offer misleading advice. In one widely shared example, Google's AI overview last week pulled from a satirical Reddit comment to suggest using glue to get cheese to stick to pizza.
Reid said the company has also added more “triggering restrictions” to improve the quality of answers to certain queries, such as those about health.
Google's summaries are designed to give people authoritative answers as quickly as possible, without requiring them to click through a ranked list of website links.
But some AI experts have long warned Google against ceding its search results to AI-generated answers that could perpetuate bias and misinformation and endanger people looking for help in an emergency. AI systems known as large language models work by predicting what words would best answer the questions asked of them based on the data they’ve been trained on. They’re prone to making things up — a widely studied problem known as hallucination.
In her Friday blog post, Reid argued that Google's AI overviews “generally don’t ‘hallucinate’ or make things up in the ways that other” large language model-based products might, because they are more closely integrated with Google's traditional search engine and show only what's backed up by top web results.
“When AI Overviews get it wrong, it’s usually for other reasons: misinterpreting queries, misinterpreting a nuance of language on the web, or not having a lot of great information available,” she wrote.