The Street
Ian Krietzberg

AI self-awareness is not 'inevitable'

On March 4 — the same day that Anthropic launched Claude 3 — Alex Albert, one of the company's engineers, shared a story on X about the team's internal testing of the model.

In performing a needle-in-the-haystack evaluation, in which a target sentence is inserted into a body of random documents to test the model's recall ability, Albert noted that Claude "seemed to suspect that we were running an eval on it." 

Anthropic's "needle" was a sentence about pizza toppings. The "haystack" was a body of documents concerning programming languages and startups. 

Related: Building trust in AI: Watermarking is only one piece of the puzzle

"I suspect this pizza topping 'fact' may have been inserted as a joke or to test if I was paying attention, since it does not fit with the other topics at all," Claude reportedly said, when responding to a prompt from the engineering team. 

Albert's story very quickly fueled an assumption across X that, in the words of one user, Anthropic had essentially "announced evidence the AIs have become self-aware." 

Elon Musk, responding to the user's post, agreed, saying: "It is inevitable. Very important to train the AI for maximum truth vs insisting on diversity or it may conclude that there are too many humans of one kind or another and arrange for some of them to not be part of the future."

Image: Elon Musk and OpenAI's Sam Altman. (TheStreet/Getty)

Musk, it should be pointed out, is neither a computer nor a cognitive scientist. 

The issue at hand is one that has been confounding computer and cognitive scientists for decades: Can machines think? 

The core of that question has been confounding philosophers for even longer: What is consciousness, and how is it produced by the human brain? 

The consciousness problem 

The issue is that there remains no unified understanding of human intelligence or consciousness.

We know that we are conscious — "I think, therefore I am," if anyone recalls Philosophy 101 — but since consciousness is a necessarily subjective state, there is a multi-layered challenge in scientifically testing and explaining it. 

Philosopher David Chalmers famously laid out the problem in 1995, arguing that there is both an "easy" and a "hard" problem of consciousness. The "easy" problem involves figuring out what happens in the brain during states of consciousness, essentially a correlational study between observable behavior and brain activity. 

The "hard" problem, though, has stymied the scientific method. It involves answering the questions of "why" and "how" as it relates to consciousness; why do we perceive things in certain ways; how is a conscious state of being derived from or generated by our organic brain? 

"When we think these systems capture something deep about ourselves and our thinking, we induce distorted and impoverished images of ourselves and our cognition." — Psychologist and cognitive scientist Iris van Rooij

Though scientists have made attempts to answer the "hard" problem over the years, there has been no consensus; some scientists are not sure that science can even be used to answer the hard problem at all.  

And given this ongoing lack of understanding of human consciousness, both how it is created by our brains and how it is connected to intelligence, the challenge of developing a verifiably conscious artificial intelligence system is, well, let's just say it's significant. 


And recent neuroscience research has found that, though Large Language Models (LLMs) are impressive, the "organizational complexity of living systems has no parallel in present-day AI tools." 

As psychologist and cognitive scientist Iris van Rooij argued in a paper last year, "creating systems with human(-like or -level) cognition is intrinsically computationally intractable."

"This means that any factual AI systems created in the short-run are at best decoys. When we think these systems capture something deep about ourselves and our thinking, we induce distorted and impoverished images of ourselves and our cognition."

The other important element to questions of self-awareness within AI models is one of training data, something that is currently kept under lock and key by most AI companies. 

That said, this particular needle-in-the-haystack response from Claude is, according to cognitive scientist and AI researcher Gary Marcus, likely a "false alarm."

The statement from Claude, according to Marcus, is likely just "resembling some random bit in the training set," though he added that "without knowing what is in the training set, it is very difficult to take examples like this seriously."

TheStreet spoke with Dr. Yacine Jernite, who currently leads the machine learning and society team at Hugging Face, to break down why this instance is not indicative of artificial self-awareness, as well as the critical transparency components missing from the sector's public research. 

Related: Human creativity persists in the era of generative AI

The transparency problem

One of Jernite's big takeaways, not just from this incident but from the sector as a whole, is that external, scientific evaluation remains more important than ever. And it remains lacking. 

"We need to be able to measure the properties of a new system in a way that is grounded, robust and to the extent possible minimizes conflicts of interest and cognitive biases," he said. "Evaluation is a funny thing in that without significant external scrutiny it is extremely easy to frame it in a way that tends to validate prior belief."

Jernite explained that this is why reproducibility and external verification are so vital in academic research; without them, developers can end up with systems that are "good at passing the test rather than systems that are good at what the test is supposed to measure, projecting our own perceptions and expectations onto the results they get."

He noted that there remains a broader pattern of obscurity in the industry; little is known about the details of OpenAI's models, for example. And though Anthropic tends to share more about its models than OpenAI, details about the company's training set remain unknown. 

Anthropic did not respond to multiple detailed requests for comment from TheStreet regarding the details of its training data, the safety risks of Claude 3 and its own impression of Claude's alleged "self-awareness."

"Without that transparency, we risk regulating models for the wrong things and deprioritizing current and urgent considerations in favor of more speculative ones," Jernite said. "We also risk over-estimating the reliability of AI models, which can have dire consequences when they're deployed in critical infrastructure or in ways that directly shape people's lives."

Related: AI tax fraud: Why it's so dangerous and how to protect yourself from it

Claude isn't self-aware (probably)

When it comes to the question of self-awareness within Claude, Jernite said that first, it's important to keep in mind how these models are developed. 

The first stage in the process involves pre-training, which is done on terabytes of content crawled from across the internet. 

"Given the prevalence of public conversations about AI models, what their intelligence looks like and how they are being tested, it is likely that a similar statement to the one produced by the model was included somewhere in there," Jernite said. 

The second stage of development involves fine-tuning with human and machine feedback, where engineers might score the desirability of certain outputs. 

"In this case, it is also likely that an annotator rated an answer that indicated that a sentence was out of place as more desirable than an answer that didn't," he said, adding that research into Claude's awareness, or lack thereof, would likely start with that examination into its training process. 

Self-awareness, Jernite said, might be an interesting — albeit abstract — topic for philosophical exploration, but when it comes to AI, "the overwhelming balance of evidence points to it not being relevant; while at the same time business incentives and design choices to make chatbots 'seem' more human are encouraging the public to pay more attention than warranted to it."

It's an impression that the model in question — Claude 3 (Sonnet) — supported. 

Image: Claude's output in response to a prompt concerning its own alleged self-awareness. (Anthropic, Claude 3 Sonnet)

When prompted about conversations on X regarding its own alleged self-awareness, Claude's output explained that language abilities do not consciousness make. 

"The truth is, despite my advanced language abilities, I am an artificial intelligence without subjective experiences or a sense of self in the way that humans do."

"I don't actually have beliefs about being conscious or self-aware. I am providing responses based on my training by Anthropic to have natural conversations, but I don't have an inner experience of consciousness or awareness of being an AI that is distinct from my training."

Contact Ian with tips and AI stories via email, ian.krietzberg@thearenagroup.net, or Signal 732-804-1223.

Related: The ethics of artificial intelligence: A path toward responsible AI
