Microsoft has unveiled a new version of its Copilot app for iPhone and Android, and with the release comes a new look, some new features, and a voice mode resembling OpenAI's popular ChatGPT Advanced Voice.
I had removed Copilot from my phone because I rarely used it, given how similar its performance was to ChatGPT. I decided to re-download it and put it to the test to see how it compares to the flagship product from OpenAI, and it's safe to say I was impressed.
While the UI is improved and more "consumer friendly" than the previous offering, making it easier to get started, the biggest upgrade is in the voice mode.
This new feature offers similar functionality to Advanced Voice: it works speech-to-speech, you can interrupt it, and it reflects your vocal tone and emotions. It was more casual and felt more natural, slightly less stilted than OpenAI's offering. However, its interruption capability isn't as fluid or natural.
At one point we were conversing about the nature of reality, and I was so engaged in the conversation I forgot I was speaking to an artificial intelligence rather than a good friend I hadn't spoken to in a while, discussing the type of random stuff friends talk about. I came out of it when the AI just randomly stopped responding.
How does Copilot stack up?
This new version of Copilot is the first under the reign of new Microsoft AI CEO Mustafa Suleyman, a co-founder of Google DeepMind and former CEO of Inflection AI, the company that makes the conversational and consumer-friendly Pi chatbot.
New Copilot bears a remarkable resemblance to Pi with its more muted color tones and simplified approach to conveying complex ideas. It feels like an AI aimed at everyone rather than just at power users, which gives Microsoft an edge in an increasingly competitive market.
There are four voice options, fewer than the 10 you get with Gemini Live or ChatGPT Voice, but I'm told more are coming in the future. It's built on an adapted version of the same underlying technology used by OpenAI, so it is natively voice-to-voice rather than first converting what you're saying to text.
There are some surprising limitations. In some ways it's more restrictive than ChatGPT because the guardrails have been better implemented. You're less likely to see it break out into song or start rapping with a backing track, but that's not necessarily a bad thing for a product aimed at an audience that may not be as tech-savvy as those using ChatGPT.
What are the voices like?
The four voices in Copilot are Grove, Canyon, Wave, and Meadow, and unlike ChatGPT, you can customize the speed at which each of them speaks. I found that the standard 1X setting makes them speak unnaturally slowly, almost as if they had only just woken up in the morning.
Just like Advanced Voice, you can also then further customize the sound of the voice by talking to it and explaining how you want it to sound, for example, adopting a slightly different accent, slightly changing the tone of their voice to be deeper or higher pitched, and even asking them to inject more emotion.
The biggest surprise for me was that it's more inclined to use slang or shorthand than other AI voice models I've tried. For example, at one point we were having a conversation about information, and it talked about gathering "deets" rather than details. As I said, it can be easy to forget you're speaking to a machine, not a person.
The biggest takeaway, though, is that this is free to use. OpenAI's Advanced Voice requires you to pay OpenAI $20 a month for a ChatGPT Plus subscription, whereas Microsoft makes voice available to anyone with a Copilot account, paying or not.