According to Meta's chief AI scientist and deep learning pioneer Yann LeCun, it will probably take decades before existing artificial intelligence (AI) systems can feel and experience emotions and acquire common sense.
In other words, he believes current AI systems aren't capable of going beyond summarising text in creative ways. This aligns with LeCun's statement at a symposium in Hong Kong last month, where leading tech experts, scholars and industry leaders shared their insights on AI.
In contrast, Nvidia CEO Jensen Huang claims AI will be "fairly competitive" with humans in less than 5 years. The top executive suggests AI will outperform people at a slew of mentally intensive tasks.
At a recently concluded media event held by Facebook parent Meta, LeCun noted that the Nvidia CEO is sparing no effort in a bid to cash in on the AI craze. "There is an AI war, and he's supplying the weapons," LeCun explained.
As expected, Nvidia is reaping the benefits of its early investment in AI. Notably, the company's chips are now used to train and run large language models (LLMs) such as those behind ChatGPT.
What is AGI and why are researchers trying to create it?
"[If] you think AGI is in, the more GPUs you have to buy," LeCun said of tech giants trying to develop artificial general intelligence (AGI). For those unaware, AGI refers to a form of artificial intelligence that can understand, learn and apply knowledge across a wide range of tasks and domains.
As long as OpenAI researchers and technologists at other firms continue their pursuit of AGI, there will be demand for Nvidia's computer chips. LeCun believes we are more likely to get "cat-level" or "dog-level" AI before human-level AI is achieved.
The tech industry will have to look beyond language models and text data if it wants to develop the kinds of advanced, human-like AI systems that researchers have long been striving to achieve.
Noting that "text is a very poor source of information," LeCun pointed out that it would take a human a whopping 20,000 years to read the amount of text used to train modern language models.
"Train a system on the equivalent of 20,000 years of reading material, and they still don't understand that if A is the same as B, then B is the same as A," he pointed out. LeCun also said this kind of training does not teach these models some basic things about the world.
In collaboration with other Meta AI executives, LeCun has been researching how the transformer models that underpin apps like ChatGPT could be adapted to work with a wider range of data, including audio, images and video.
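In broad strokes, one common way to feed several modalities into a single transformer is to project each modality's features into a shared embedding space and let the model attend over the combined token sequence. The PyTorch sketch below illustrates only that general pattern; the class name, dimensions and layer counts are illustrative assumptions, not Meta's actual architecture.

```python
import torch
import torch.nn as nn

class MultimodalEncoder(nn.Module):
    """Toy multimodal encoder: each modality gets its own projection into a
    shared embedding space, and one transformer attends over the combined
    token sequence."""

    def __init__(self, d_model=256, n_heads=4, n_layers=2,
                 text_dim=512, audio_dim=128, image_dim=1024):
        super().__init__()
        # Per-modality projections into the shared d_model space.
        self.text_proj = nn.Linear(text_dim, d_model)
        self.audio_proj = nn.Linear(audio_dim, d_model)
        self.image_proj = nn.Linear(image_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, text_feats, audio_feats, image_feats):
        # Each input: (batch, sequence_length, modality_feature_dim).
        tokens = torch.cat([
            self.text_proj(text_feats),
            self.audio_proj(audio_feats),
            self.image_proj(image_feats),
        ], dim=1)                    # (batch, total_sequence_length, d_model)
        return self.encoder(tokens)  # joint representation of all modalities


# Example with random features standing in for real text/audio/image encoders.
model = MultimodalEncoder()
out = model(torch.randn(2, 16, 512),    # 16 text tokens
            torch.randn(2, 50, 128),    # 50 audio frames
            torch.randn(2, 196, 1024))  # 196 image patches
print(out.shape)  # torch.Size([2, 262, 256])
```

In practice, such systems also rely on pretrained per-modality feature extractors and positional or modality embeddings, which the sketch omits for brevity.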
As part of its research, Meta has developed software that can help people improve their tennis game while wearing the company's Project Aria augmented reality (AR) glasses, which overlay digital graphics on the real world.
In a demo conducted by executives, a person wearing Meta's AR glasses while playing tennis was able to see visual cues that showed them how to swing their arms in perfect form and how to properly hold their tennis rackets.
This type of digital tennis assistant is powered by AI models that combine three-dimensional visual data with text and audio, the audio component allowing the assistant to speak.
Researchers across the industry are already working on such multimodal AI systems. However, according to a report by CNBC, their development carries a steep price tag.
Also, with more companies such as Meta and Google parent Alphabet trying to develop advanced AI models, Nvidia is likely to keep its edge, especially if no serious competitor emerges.
What has Nvidia gained from generative AI?
It is no secret that Nvidia is one of the biggest beneficiaries of generative AI. The company's steeply priced graphics processing units (GPUs) have been used to train massive language models.
For instance, Meta used 16,000 Nvidia A100 GPUs to train its Llama AI software. CNBC asked LeCun whether he thinks the tech industry will need more hardware providers as researchers continue taking major steps towards developing these kinds of sophisticated AI models.
"It doesn't require it, but it would be nice," LeCun said, further noting that GPU technology is still the best available option when it comes to AI. However, he believes computer chips may not be called GPUs in the future.
"What you're going to see hopefully emerging are new chips that are not graphical processing units, they are just neural, deep learning accelerators," LeCun said.