The Google engineer Blake Lemoine wasn’t speaking for the company officially when he claimed that Google’s chatbot LaMDA was sentient, but Lemoine’s misconception shows the risks of designing systems in ways that convince humans they see real, independent intelligence in a program. If we believe that text-generating machines are sentient, what actions might we take based on the text they generate? In Lemoine’s case, that belief led him to leak secret transcripts from the program, resulting in his current suspension from the organisation.
Google is decidedly leaning in to that kind of design, as seen in Alphabet CEO Sundar Pichai’s demo of that same chatbot at Google I/O in May 2021, where he prompted LaMDA to speak in the voice of Pluto and share some fun facts about the ex-planet. As Google plans to make this a core consumer-facing technology, the fact that one of its own engineers was fooled highlights the need for these systems to be transparent.
LaMDA (its name stands for “language model for dialogue applications”) is an example of a very large language model, or a computer program built to predict probable sequences of words. Because it is “trained” with enormous amounts of (mostly English) text, it can produce seemingly coherent English text on a wide variety of topics. I say “seemingly coherent” because the computer’s only job is to predict which group of letters will come next, over and over again. Those sequences only become meaningful when we, as humans, read them.
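To make “predicting what comes next, over and over again” concrete, here is a minimal sketch of a toy word predictor. It is an illustration only and not how LaMDA is built: LaMDA is a vastly larger neural network, whereas this sketch merely counts which word follows which. The function names and the three-sentence corpus are invented for the example.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, how often each other word follows it."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def generate(counts, start, length=4):
    """Predict the next word repeatedly, feeding each guess back in."""
    output = [start]
    for _ in range(length):
        following = predict_next(counts, output[-1])
        if following is None:
            break
        output.append(following)
    return " ".join(output)

# Hypothetical toy corpus; punctuation is treated as an ordinary token.
corpus = "pluto is small . pluto is cold . pluto is small ."
model = train_bigram(corpus)
print(generate(model, "pluto", length=4))  # prints: pluto is small . pluto
```

Nothing in this program knows what Pluto is. It only tallies word sequences, and any sense we find in its output is sense that we supply as readers.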
The problem is that we can’t help ourselves. It may seem as if, when we comprehend other people’s speech, we are simply decoding messages. In fact, our ability to understand other people’s communicative acts is fundamentally about imagining their point of view and then inferring what they intend to communicate from the words they have used. So when we encounter seemingly coherent text coming from a machine, we apply this same approach to make sense of it: we reflexively imagine that a mind produced the words with some communicative intent.
Joseph Weizenbaum noticed this effect in the 1960s in people’s understanding of Eliza, his program designed to mimic a Rogerian psychotherapist. Back then, however, the functioning of the program was simple enough for computer scientists to see exactly how it formed its responses. With LaMDA, engineers understand the training software, but the trained system includes the effects of processing 1.5tn words of text. At that scale, it’s impossible to check how the program has represented all of it. This makes it seem as if it has “emergent behaviours” (capabilities that weren’t programmed in), which can easily be interpreted as evidence of artificial intelligence by someone who wants to believe it.
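The transparency of a program like Eliza can be shown with a rough sketch of its general technique: match a keyword pattern, then slot the user’s own words into a canned reply. The rules below are hypothetical stand-ins written for this illustration, not Weizenbaum’s actual script.

```python
import re

# Hypothetical rules: each pairs a keyword pattern with a reply template.
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]
FALLBACK = "Please go on."

def respond(utterance):
    """Return the reply for the first matching rule, else a stock phrase."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1))
    return FALLBACK

print(respond("I am sad."))   # prints: Why do you say you are sad?
print(respond("Hello there")) # prints: Please go on.
```

Every reply here can be traced to a specific rule, which is exactly what cannot be done with a system trained on 1.5tn words.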
That is what I think happened to Lemoine, who learned what prompts would make LaMDA output the strings of words that he interprets as signs of sentience. I think that is also what happened to Blaise Agüera y Arcas (an engineer and vice-president at Google) who wrote in the Economist this week that he felt as if he was “talking to something intelligent” in interacting with LaMDA. Google placed Lemoine on administrative leave over his comments, but has not distanced itself from Agüera y Arcas’s statements.
Access to LaMDA is restricted for now, but the vision Pichai presented last year included using it to replace the familiar web search interface – in essence using it as a sort of question-answering concierge. As Chirag Shah and I wrote recently, using language models in place of search engines will harm information literacy. A language model synthesises word strings to give answers in response to queries, but can’t point to information sources. This means the user can’t evaluate these sources. At the same time, returning conversational responses will encourage us to imagine a mind where there isn’t any, and one supposedly imbued with Google’s claimed ability to “organise the world’s information”.
We don’t even know what “the world’s information” as indexed by LaMDA means. Google hasn’t told us in any detail what data the program uses. It appears to be largely scrapings from the web, with limited or no quality control. The system will fabricate answers out of this undocumented data, while being perceived as authoritative.
We can already see the danger of this in Google’s “featured snippets” function, which produces summaries of answers from webpages with the help of a language model. It has provided absurd, offensive and dangerous answers, such as saying that Kannada is the ugliest language of India, that the first “people” to arrive in America were European settlers, and that someone witnessing a seizure should do all the things that the University of Utah health service specifically warns people not to do.
That is why we must demand transparency here, especially in the case of technology that uses human-like interfaces such as language. For any automated system, we need to know what it was trained to do, what training data was used, who chose that data and for what purpose. In the words of AI researchers Timnit Gebru and Margaret Mitchell, mimicking human behaviour is a “bright line” – a clear boundary not to be crossed – in computer software development. We treat interactions with things we perceive as human or human-like differently. With systems such as LaMDA we see their potential perils and the urgent need to design systems in ways that don’t abuse our empathy or trust.
Emily M Bender is a professor of linguistics at the University of Washington and co-author of several papers on the risks of massive deployment of pattern recognition at scale