Meta announced it was bringing a new voice mode to its increasingly popular Meta AI chatbot during its Connect event last month. It is finally starting to roll out, and from some initial testing I’ve found it more engaging and natural than I expected.
Unlike OpenAI’s Advanced Voice in ChatGPT, which is native speech-to-speech, Meta AI Voice first converts what you say to text, responds in text, then reads its response out loud. This is the same approach Google’s Gemini Live uses.
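To make the distinction concrete, here is a minimal conceptual sketch in Python of that cascaded pipeline. The transcribe, generate_reply and synthesize functions, and the voice name, are hypothetical stand-ins for illustration only, not Meta’s actual API.

```python
# Conceptual sketch of a cascaded voice pipeline like Meta AI Voice's.
# transcribe(), generate_reply() and synthesize() are hypothetical
# placeholders, not Meta's real API.

def transcribe(audio: bytes) -> str:
    """Speech-to-text: convert the user's audio into a transcript."""
    ...

def generate_reply(prompt: str) -> str:
    """The underlying text model answers the transcript in text."""
    ...

def synthesize(text: str, voice: str) -> bytes:
    """Text-to-speech: read the reply aloud in the chosen voice."""
    ...

def cascaded_turn(audio: bytes, voice: str = "system_voice_1") -> bytes:
    transcript = transcribe(audio)       # step 1: audio -> text
    reply = generate_reply(transcript)   # step 2: text -> text
    return synthesize(reply, voice)      # step 3: text -> audio

# A native speech-to-speech model (like ChatGPT's Advanced Voice)
# collapses all three steps into a single model call, which is why
# it can react to your tone, not just your words.
```

Because the reply exists as text at step two, the app can show the words on screen as it speaks them, which is exactly what you see when using Meta AI Voice.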
Meta AI also has a range of voices, including celebrities like Dame Judi Dench, Awkwafina, Keegan-Michael Key and Kristen Bell. There are also five additional system voices, not cloned from any famous person, that sound just as natural and engaging.
Despite being text-to-speech, you can interrupt the AI, and I found it handles interruptions better than Microsoft’s new Copilot or Google Gemini Live.
How do I access Meta AI Voice?
Meta AI is available in most Meta products, including WhatsApp, Instagram, Facebook and the Ray-Ban smart glasses. To access it, you simply start a chat conversation with the AI instead of a human contact.
It isn't globally available yet, but access is being rolled out piecemeal. For example, the UK has access in WhatsApp but not to the web version at meta.ai. Access in the glasses is also sporadic, with different features available in different countries.
Where you do have access, the voice mode appears as an icon in the chat bar. It looks like a waveform: a row of lines that grow larger and then smaller again. Tapping it switches the view to a circle on the screen, and the AI begins talking in the voice you’ve selected.
To change the voice, click the 'info' icon in the top-right corner of the voice mode, select Voice and pick from the list of nine options. In the settings view you can also see details of previous conversations and images you have sent to the AI.
When you use voice mode in ChatGPT or Copilot, all you get is the voice, but because Meta AI generates its response as text first, it displays the words it speaks on screen for you to read along. This isn’t always perfect, though. At one point I asked the Judi Dench AI to rap and the on-screen text said ‘wrap’, which I suspect the real Judi Dench would be better at than spitting bars over a beat.
How well does Meta AI voice work?
Meta AI Voice is somewhat more robotic than Advanced Voice or Copilot Voice. This is the result of having to first transcribe what you’re saying.
The biggest benefit of native speech, used in Copilot and Advanced Voice, is the model’s ability to adapt its voice in response to how you speak. Despite those slightly stilted exchanges, though, Meta AI’s voices themselves are genuinely impressive.
The celebrity voices sound a lot like the person they are mimicking, and even the non-celebrity voices sound more natural than I’ve heard from other models. I think Meta has the best voices of any AI tool. Switching between them gave me flashbacks to picking a ringtone as a teenager in the ’90s.
After my testing, I felt I owed Dame Judi Dench an apology for asking the AI to translate a sonnet into 'Gen Alpha slang' and then having it read the result out loud in her voice.
Beyond the voice, it is like any other AI. It is closest in performance to Google Gemini Live, and with access to your Meta data (read: Facebook, Instagram and WhatsApp), it has a layer of personal context that only Apple can match.
Despite the limitations, I found it more responsive than the cleverer native voice models; it never refused to answer, and I was able to interrupt it successfully every time. However, it would only recite snippets of a real work (a poem or story), and its made-up stories ran a paragraph or two at most, with no long speeches.
Meta has done an exceptional job, not just with the Meta AI chatbot but also with its voice model, and my prediction is that a billion people could be using it regularly by the end of 2025.