OpenAI left everyone agog earlier this week with a new and improved version of ChatGPT, the AI chatbot that has quickly become an essential helper for 100 million users.
Known as GPT-4o (the “o” stands for “omni”), the upgraded bot promises to be the Siri you’ve always dreamed of. It can sing, emote, instantly translate languages, identify the world in live videos, talk to you (and itself), and much, much more.
At its core, the bot is a better, stronger, and faster variety of OpenAI’s flagship AI, GPT-4. Although it has the same size brain as its predecessor, it can do a lot more with it, particularly when it comes to its text, audio and vision capabilities. Best of all, it will be made available to both free and paying ChatGPT users in the coming weeks, with its audio and video features coming later down the line.
When it eventually arrives on these shores, these are the top things you should try first:
What can you do with GPT-4o?
Here are some of the main use cases for GPT-4o, including those showcased by OpenAI during its recent demo event and examples from AI power users online.
A more “human” voice assistant
Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: https://t.co/MYHZB79UqN
Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. pic.twitter.com/uuthKZyzYx
— OpenAI (@OpenAI) May 13, 2024
ChatGPT can now function as a fully fledged voice assistant. Think of it like Alexa, if it were able to hold a free-flowing conversation based on what it can see and hear. But that’s not all. OpenAI has also tried to make the bot more human by allowing it to express different emotions and even identify how you’re feeling. In demos, the bot sang a bedtime story, sensed sarcasm and emulated it, accurately guessed that it was talking to an OpenAI employee from the company logo on their hoodie, and knew a person was smiling in a live video.
Sam Altman says he's only had GPT-4o for a week, but he uses it to help him stay in the zone while working pic.twitter.com/FRcc8qzCti
— Tsarathustra (@tsarnick) May 14, 2024
So eager was it to please that some netizens even joked that they were developing feelings for the female-voiced bot. In an interview, OpenAI boss Sam Altman suggested that the bot will always be on hand to help out, much like Siri, further signalling that this is the next generation of digital assistants. Depending on how you feel about AI, you’ll either be enamoured or totally creeped out by it.
Prep for a work interview using multiple ChatGPTs
ChatGPT users are already asking it for career tips and advice on everything from writing a cover letter to improving their CV. In the near future, you’ll be able to take your prep work a step further by prompting the bot to give you tips on what to wear to an interview based on your appearance. The AI did just that in a demo, advising an OpenAI staffer to run a hand through his scruffy hair and ditch the bucket hat ahead of the big meeting.
Furthermore, with OpenAI previewing an impressive video-and-voice-powered interaction between a pair of bots, you may be able to enact an entire interview between two GPTs to get you primed for the big day. Of course, it will probably take a bit of legwork to set up, including lengthy instructions for the two bots, but it may be worth it if it lands you that dream gig.
Host work meetings
It seems ChatGPT will soon start popping up in many more places, including Zoom calls with your colleagues. The bot can apparently keep track of a meeting, identify what each attendee is talking about, direct the convo, and summarise the whole thing at the end.
Its attention span will probably be governed by its context window, which refers to the amount of text (counted in tokens, which are roughly word fragments) it can digest when formulating a response. Notably, the demo for this perk lasted just two minutes, so it’s unclear how it will fare during lengthier meetings.
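For a rough sense of scale, here is a back-of-envelope sketch in Python. Every number in it is an assumption made for illustration: a 128,000-token window, the common rule of thumb that one token is about three-quarters of an English word, and a speaking pace of 150 words per minute. It treats the meeting as plain text and ignores the bot's own replies and any audio or video input, so read the result as a loose upper bound rather than a prediction.

```python
# Back-of-envelope estimate only; all constants below are assumptions.
CONTEXT_WINDOW_TOKENS = 128_000  # assumed window size, for illustration
WORDS_PER_TOKEN = 0.75           # rule of thumb for English text
WORDS_PER_MINUTE = 150           # assumed conversational speaking pace


def estimate_tokens(word_count: int) -> int:
    """Approximate how many tokens a given number of English words uses."""
    return round(word_count / WORDS_PER_TOKEN)


def minutes_until_full(context_tokens: int = CONTEXT_WINDOW_TOKENS,
                       words_per_minute: int = WORDS_PER_MINUTE) -> float:
    """Estimate how many minutes of speech would fill the context window."""
    tokens_per_minute = words_per_minute / WORDS_PER_TOKEN
    return context_tokens / tokens_per_minute


if __name__ == "__main__":
    # A two-minute demo at the assumed pace is about 300 words.
    print(estimate_tokens(300))
    print(minutes_until_full())
```

Under these assumptions a two-minute exchange uses only a few hundred tokens, while the window would take hours of talk to fill. The real limit is likely to bite sooner once the bot's replies and any visual input are counted.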
Now that OpenAI is launching a dedicated Mac app for ChatGPT, it could easily linger on your video calls. The digital assistant was also shown refereeing a rock, paper, scissors game and could theoretically oversee other simple pastimes as well. Just don’t tell it to watch over your kids, as some social media users are advising.
Take an AI translator on your next holiday
Live audience request for GPT-4o vision capabilities pic.twitter.com/FPRXpZ2I9N
— OpenAI (@OpenAI) May 13, 2024
OpenAI wants to make translating languages more seamless than it is with the tools currently at our disposal (think image-recognition powerhouse Google Lens).
In a slick demo, GPT-4o listened in on an exchange between two people, one speaking Italian and the other English, translating as it went. As the speakers carefully enunciated their words, the whole thing felt a little staged, calling into question the bot’s real-life prowess.
For instance, will it be able to distinguish between different Italian dialects like Sicilian, Neapolitan, and Sardinian? Nevertheless, pointing it at a foreign menu that looks like gobbledegook could be a lifesaver.
Ask an AI tutor to help with homework
One of the biggest concerns that have surfaced since the overnight success of ChatGPT is its use in education. Some schools have even banned the bot for fear that students would use it to cheat on their homework.
So, not to ruffle any feathers, OpenAI seems to be focusing more on GPT-4o’s ability to act as a teaching aid rather than a wholesale essay writer. We were shown the bot helping a young lad with his geometry homework on an iPad, going back and forth using its vision and audio perks to guide him to a solution. In this segment, ChatGPT accurately interpreted the boy’s drawings and responded with feedback in its typically positive demeanour.
The new feature may still alarm those who believe AI is out to take our jobs but, as a free learning tool, it will probably appeal to plenty of parents and children.
Web search, file uploads, data analysis
GPT-4o is truly remarkable on 18th handwriting. I gave it the following letter and asked it for a transcription. A couple of very minor errors…amazing! pic.twitter.com/3JevZvd5p5
— Generative History (@HistoryGPT) May 14, 2024
With the launch of GPT-4o, ChatGPT free users will finally get many of the premium features that were previously limited to monthly subscribers.
For instance, the bot will be able to fetch relevant, up-to-date information from the web using Microsoft’s Bing search engine. It will also have the power to turn your uploaded data into interactive tables and charts. More broadly, you’ll be able to instruct it to edit videos, images, documents and more. OpenAI says these advanced tools will arrive in the coming weeks.
Meanwhile, some users have already shared the new ways they’re using its skillset, including deciphering centuries-old handwritten letters.