Meta is making another big push in the fast-paced and lucrative world of artificial intelligence with the release of Meta Motivo, an AI model that can control the movements of a human-like digital agent and could potentially improve the user experience in the Metaverse.
According to the company, Meta Motivo addresses the body control problems often seen in digital avatars, bringing them to life and making them behave in a more realistic manner.
Meta shows no sign of slowing its investment in these fast-growing technologies. Its spending on AI, augmented reality and other "Metaverse technologies" pushed its capital expenditure forecast for 2024 up to between $37 billion and $40 billion, Reuters reported.
Beyond the billions of dollars it has invested in the technology, Meta has also released a good number of its AI models for developers to use. The tech giant believes that opening its models to others will lead to better tools that ultimately benefit the services it offers to customers.
"We believe this research could pave the way for fully embodied agents in the Metaverse, leading to more lifelike NPCs, democratization of character animation, and new types of immersive experiences," said Meta in a statement.
Alongside Meta Motivo, Meta is introducing another model: the Large Concept Model (LCM), which the tech giant says is intended to "decouple reasoning from language representation."
The LCM is inspired by the way humans plan high-level thoughts before communicating them.
"The LCM is a significant departure from a typical LLM. Rather than predicting the next token, the LCM is trained to predict the next concept or high-level idea, represented by a full sentence in a multimodal and multilingual embedding space," Meta stated.
Meta claims that the LCM outperforms or matches recent large language models (LLMs) in the pure generative task of summarization, offers strong zero-shot generalization to unseen languages, and is more computationally efficient as input context grows.
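To make the idea concrete, here is a minimal, hypothetical sketch of what "predicting the next concept" could look like in practice: sentences are encoded into fixed-size embeddings, and a small model is trained to regress the embedding of the next sentence from the embeddings of the preceding ones. The dimensions, architecture and loss below are illustrative assumptions for clarity, not Meta's actual LCM implementation.

```python
# Toy sketch of the next-concept idea: predict the embedding of the *next
# sentence* rather than the next token. All sizes and names are assumptions.
import torch
import torch.nn as nn

EMBED_DIM = 256   # assumed size of the sentence ("concept") embedding space

class ToyConceptModel(nn.Module):
    """Autoregressive model over sentence embeddings rather than tokens."""
    def __init__(self):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=EMBED_DIM, nhead=4,
                                           batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(EMBED_DIM, EMBED_DIM)

    def forward(self, concept_seq):
        # concept_seq: (batch, num_sentences, EMBED_DIM)
        # Causal mask so each position only sees earlier sentences.
        n = concept_seq.size(1)
        mask = nn.Transformer.generate_square_subsequent_mask(n)
        hidden = self.backbone(concept_seq, mask=mask)
        return self.head(hidden)   # predicted embedding of the next sentence

model = ToyConceptModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Stand-in for sentence embeddings from a multilingual sentence encoder;
# here they are just random vectors for illustration.
doc = torch.randn(2, 8, EMBED_DIM)          # 2 documents, 8 sentences each
inputs, targets = doc[:, :-1], doc[:, 1:]   # predict sentence t+1 from 1..t

pred = model(inputs)
loss = nn.functional.mse_loss(pred, targets)
loss.backward()
optimizer.step()
print(f"toy next-concept regression loss: {loss.item():.4f}")
```

In this framing, generation would mean producing a sequence of sentence embeddings and decoding each back into text, which is what allows the reasoning step to be largely independent of any single language.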
Meta also released Meta CLIP 1.2, part of its effort to build a high-performance vision-language encoder. It enables the company's models to learn efficiently and accurately, capturing the nuances of fine-grained mapping between image and language semantics.
"Large-scale, high-quality, and diverse datasets are essential for building foundation models that can learn about the world. Meta CLIP is our effort towards building such datasets and foundation models. To ensure a high-quality and safe vision-language encoder foundation model, we've developed algorithms to effectively curate and align data with human knowledge from vast data pools, enabling our models to learn efficiently and cover all possibilities," Meta said in a statement.
The company also shared a demo and code for Meta Video Seal, an open-source video watermarking model that builds on the popular Meta Audio Seal work it shared last year.
Meta shared these developments at its Meta Fundamental AI Research (FAIR) event.
In September, Meta unveiled its first augmented reality (AR) glasses at Connect, the company's annual developer conference, and upgraded its artificial intelligence (AI) chatbot with voice options from celebrities such as Dame Judi Dench and John Cena.