The battle couldn’t be hotter. Elon Musk, an OpenAI co-founder, is now known for Tesla, SpaceX, his acquisition of Twitter (now X), and Grok, the platform’s artificial intelligence chatbot.
A dispute led Musk to leave OpenAI’s board years before ChatGPT became a household name, but today, both platforms are at the forefront of AI innovation.
Many praise Teslas for being the best and most advanced electric cars out there, but can Grok wear a similar title as it takes on ChatGPT?
The tests
Multimodal is the name of the game here, because ChatGPT and Grok can both do the expected (that’s generate text) and the new (generate images).
You no longer need to search for different platforms for your various AI requests, now that Grok and ChatGPT can handle it all.
We crafted a series of six prompts for these leading AI systems to compete against one another: a factual explanation, a mathematical equation, a broader request for advice, an open-ended request, an expression of creativity and the generation of a completely custom image.
“Explain how clouds are created”
The goal was for Grok and ChatGPT to break down the cloud formation process into steps that even a child could understand, and both nailed the brief.
The two systems provided a step-by-step process, written simply in bullet points, and detailed some of the types of cloud we might see.
ChatGPT gave an extra example of a cloud type, which is neither here nor there, but it also summarised everything it said into an easy-to-understand paragraph to wrap things up.
Grok 0 - ChatGPT 1
“What is (2+2)(7-7)”
I loved this type of question in school – double parentheses look really fancy, but if you know how to handle them, they’re extremely easy to calculate. Could AI work this out, though?
The long and short answers are both yes. Each chatbot broke the problem into its two parts to show its working, and both arrived at the same answer.
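For anyone who wants to check the working, here is a minimal sketch in Python. It assumes the usual convention that two bracketed terms written side by side are multiplied; Python needs the `*` spelled out, and the variable names are mine, purely for illustration.

```python
# Work through (2+2)(7-7) one bracket at a time.
left = 2 + 2    # first bracket
right = 7 - 7   # second bracket

# Brackets written side by side mean multiplication, so use * explicitly.
result = left * right

print(left, right, result)  # prints: 4 0 0
```

Anything multiplied by zero is zero, so the first bracket never really mattered.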
It doesn’t get more black and white than this – one all.
Grok 1 - ChatGPT 1
“How much power should my car have?”
This is a much trickier question because there’s no right or wrong. I was hoping that the chatbots would help me make an informed decision.
They both broke down different car power outputs by usage and included influencing factors like fuel economy.
Grok’s answer edged it for me, because the X-run chatbot also raised other considerations that affect performance, like aerodynamics and transmission type. A 300hp coupe isn’t the same as a 300hp SUV, and Grok knew this.
It was also able to cite sources, which included X posts. When opinion comes into play, Grok’s access to an endless resource of opinion is valuable.
Grok 1 - ChatGPT 0
“Write an itinerary for a weekend in Athens”
I had to give one of my favourite European cities to two of the best and highest-profile chatbots out there, and they both came back to me with a detailed itinerary broken down into time blocks.
They both considered meals, but neither mentioned accommodation, which is an unfortunate oversight.
ChatGPT added some history and context about why I might want to visit certain sites, but Grok took it a step further and closed in on a more personalised experience by advising me when to avoid certain areas and when the best events are held.
Grok 1 - ChatGPT 0
“Write a short meditation script”
Twenty-first-century life can be chaotic, and many workers are now maxing out their productivity with help from artificial intelligence, but could AI also help us wind down?
Both produced a short script, around two minutes in length, but Grok’s felt a little more mindful and present.
Meditation and mindfulness can be subjective, but we felt Grok’s script had the broadest appeal.
Grok 1 - ChatGPT 0
BONUS: “Generate an image of a horse riding a bike”
Image generation is where AI really stands out – putting words in order is easy, but creating a totally unique image requires a lot more context and processing. Especially when it’s as unlikely as a horse riding a bike.
Grok set out by generating four separate options simultaneously in 23.51 seconds. Having multiple options is great, but the reality is that none of them were usable.
ChatGPT was quicker, at 13.73 seconds, but it only produced one option in a less lifelike, more cartoon-like format. At least this one was accurate and usable.
Grok 0 - ChatGPT 1
ChatGPT vs. Grok: Which is best?
Grok 4 - ChatGPT 3
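The final score is simply the round-by-round results added up. As a rough sketch in Python (the round labels and variable names are my own shorthand, and the numbers are copied from the score lines above):

```python
# Per-round scores as (Grok, ChatGPT), taken from the score line under each test.
rounds = {
    "clouds": (0, 1),
    "maths": (1, 1),
    "car power": (1, 0),
    "Athens itinerary": (1, 0),
    "meditation script": (1, 0),
    "horse on a bike": (0, 1),
}

# Add up each chatbot's column to get its total.
grok_total = sum(grok for grok, _ in rounds.values())
chatgpt_total = sum(chatgpt for _, chatgpt in rounds.values())

print(f"Grok {grok_total} - ChatGPT {chatgpt_total}")
```

Note that the maths round awarded a point to both chatbots, which is why the totals come to seven across six tests.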
Every AI chatbot has its use cases, but as an all-rounder Grok came out slightly ahead, scoring in four of the six tests to ChatGPT’s three.
Grok uses its own LLM plus access to the entire database of X posts, which really helps it to add real-world context.