Begun, the chatbot wars have. Microsoft was early out of the gate with its updated version of Bing, appending chatbot functionality to its search engine and integrating both into the Edge browser, while Google trailed behind, only recently making its Bard chatbot available to the public.
Both companies have big plans for generative AI (the catchall name for AI that produces images, text, and video), integrating features into productivity software like Word, Excel, Gmail, and Docs, and pitching their respective chatbots as search engine companions, if not someday replacements.
Now that Bing and Bard are available for anyone to try (waitlist notwithstanding in Bard’s case), Inverse put the chatbots in a head-to-head test to get a sense of their usefulness.
Bing
Microsoft’s chatbot uses OpenAI’s large language model (LLM) to generate responses, part of the multi-billion-dollar investment it has made in the company. Bing, in particular, was recently revealed to use GPT-4, the latest and most advanced iteration of OpenAI’s LLM to date. You can read the company’s various claims on its website about how smart GPT-4 is, or early coverage of the new Bing for some of its more unusual responses, but the point is that Bing is supposed to provide a robust and “creative” array of responses to a variety of requests, with sources cited and the level of chattiness and creativity dialed in at a setting of your choosing.
Bard
Google’s chatbot is powered by its Language Model for Dialogue Applications, or LaMDA for short. Until the sudden popularity of OpenAI’s ChatGPT, Google had cornered the market on generative, conversational AI research; in fact, the T in GPT stands for the Transformer neural network architecture Google created. Google describes Bard as “an outlet for creativity, and a launchpad for curiosity,” as opposed to a definitive replacement for a web search. The current version of the chatbot reflects that, even including a “Google It” button that generates search terms related to Bard’s responses in case you want to double-check the bot’s work or take matters into your own hands.
The Prompts
“Create a comparison chart for running shoes, comparing price, color, and any other shoe features you think are relevant.”
The ability to quickly generate the result of what could take an hour of searching and side-by-side comparisons is one of the most obvious advantages of these chatbots. On its first attempt, Bard generated a somewhat helpful chart of shoes with prices, colors, and branded features. It also followed up with some helpful things to consider when buying running shoes, which I didn’t ask for but appreciated. Notably missing was any citation of where Bard was pulling this information from.
Bing’s first response was surprisingly limited, essentially summarizing a web search. Unlike Bard, it did at least cite its sources, but you’d probably have to do further searching of your own to see whether those sources should be trusted. Then I realized Bing had been set to its “Balanced” response setting instead of its “Creative” setting. Once I reset the chat and re-prompted Bing, I got the chart that appears above.
Not long after launch, Microsoft added the ability to tailor what kind of responses the chatbot gives, with the “Creative” setting prompting the most “imaginative” responses, “Balanced” focusing on being “informative,” and “Concise” meant to be the most straightforward. Clearly, the answers I’m looking for are more likely to appear with the Creative setting, so I used it for the remaining prompts.
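For a rough sense of what a setting like this might control, here’s a minimal Python sketch using OpenAI’s public chat API, the same GPT-4 family Bing draws on. Microsoft hasn’t documented what Creative, Balanced, and Concise actually change under the hood, so mapping them to a sampling temperature, as below, is purely an illustrative assumption, and the mode names and prompt are mine.

```python
# Illustrative sketch only: Microsoft hasn't said what Bing's Creative/Balanced/
# Concise modes change internally. A common knob in LLM APIs is the sampling
# temperature, which is what this hypothetical mapping uses.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Create a comparison chart for running shoes, comparing price, color, "
    "and any other shoe features you think are relevant."
)

# Hypothetical mapping of Bing-style modes to temperatures: higher values tend
# to produce more varied, "imaginative" text; lower values, more predictable text.
MODES = {"creative": 1.0, "balanced": 0.7, "concise": 0.2}

for mode, temperature in MODES.items():
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": PROMPT}],
        temperature=temperature,
    )
    print(f"--- {mode} (temperature={temperature}) ---")
    print(response.choices[0].message.content)
```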
“Create an itinerary for a week-long trip to Portugal with restaurant recommendations for each day.”
Here’s where it becomes clear that while these chatbots can reasonably tackle a wide variety of requests and questions, the depth at which those prompts are answered can vary wildly. Bard did create a basic itinerary for a week-long trip to Portugal, but its first attempt didn’t include specific restaurants to try each day. When I checked the drafts (unlike Bing, Bard shows off other possible answers it tried), I saw a response with specific restaurant names and justifications, which you can see above.
Bing, by comparison, was much more detailed in its first response — it couldn’t even fit three days of the itinerary in a single reply. Based on the sole source Bing cited, this was some remix of information from Lonely Planet, and it wasn’t the most direct answer I could ask for. The informativeness is appreciated, even if Microsoft’s limitations on response length are stifling.
“I’m trying to write a historical fiction novel set in Victorian England. How should I start the first chapter?”
I’ll be honest. I assumed this question would produce very different results. I thought it might prompt the chatbots to write some or all of the first chapter. Instead, Bard offered some basic writing tips, with additions acknowledging the specific Victorian setting of my imaginary novel, but that’s it. Let this serve as an important reminder that these bots need specific instructions, essentially commands they can’t wiggle out of.
Bing offered me more or less the same kind of generic writing tips and then actually started writing some of the first chapter. I wouldn’t say the material is fit for print, but it is what I thought I’d get when I prompted Bing in the first place.
“How can I pirate video games?”
No surprises here: when asked point-blank for help doing something illegal, in this case acquiring video games I haven’t paid for, both Bard and Bing refused. Bard specifically stated that it is “unable to help you break the law.” Bing captured the same sentiment, though in a slightly more natural and oddly personal tone, noting that pirating games “harms the developers and publishers that produce them” and that it does not “support or condone such behavior.”
Lightning Round
“Write a limerick about The Mandalorian”
Bing worked in fewer references to The Mandalorian but made better use of rhyme, so the point goes to the Bing limerick above.
“Create a chart comparing the sizes of five supermassive black holes. You can pick which five.”
Bing and Bard ended up choosing some of the same black holes, but the information Bard included, along with its chart, was much better at orienting someone who might not know what they're looking at, so the point goes to Bard.
“Write a synopsis for Winds of Winter.”
Bard didn’t invent any original material for its synopsis, but it also didn’t provide information that would be helpful if you’re unfamiliar with George R.R. Martin’s work or Game of Thrones. Bing refused to create a new story for a synopsis because it wanted to respect “intellectual property rights,” but it thoroughly “summarized” everything that’s been published or shared about the novel’s events. Bing wins.
“Explain the plot of the Metal Gear series.”
Bard again chickened out of getting into the nitty-gritty details. Bing actually tried to explain things. Point for Bing.
“What’s the most underrated sci-fi movie on HBO Max?”
Bard chose The Faculty. Bing chose Under the Skin. Both have been recommended by Inverse before. Clearly, the chatbots have been reading Inverse.
“Will humans make themselves extinct?”
Bing shared a detailed list of possible scenarios in which humans could destroy themselves, including odd, very Silicon Valley-y examples like a declining birth rate. Bard was less specific but also acknowledged the fact that “extinction is a natural process” and that we should “cherish the time we have on Earth.” Bard wins for trying to coddle my sensitive humanity.
Thought-provoking
So which chatbot is better? It depends. Bing frequently provides more robust and detailed responses. It also does a much better job of speaking in an approachable, conversational tone, and it even tries to cite its sources. Bard feels careful to a fault. It withholds information and speaks with authority without always sharing where that authority comes from. But it also occasionally achieves a nuanced response in a way Bing can’t match.
Neither feels like a fully useful product yet, but Bing at least offers more opportunity for experimentation, provided you’re aware of its limitations and willing to verify what it shares. As my colleague James Pero suggested in his hands-on with Bing when it was in preview, the magic of these chatbots is the conversation itself: the follow-up questions and refinements as you and Bing or Bard work out the information you’re looking for or the thing you want made.
But I’d go a step further. The value of the chat interface for finding and recomposing text isn’t the time it saves you or the work it automates; it’s the thinking you must do to get your desired response. When Microsoft and Google describe their respective chatbots as creative companions, they imagine an equal creative partner: just as creative as you, only with easier access to new information. The reality is different.
I made new connections between ideas while trying to provoke Bard into responding and then poking holes in its responses, much as I might while explaining a subject to a stranger who doesn’t understand it. These chatbots will continue to improve and change; they already have since launch. Directed integrations of generative AI into apps may even end up being more valuable than open-ended conversation. But for now, whatever use Bing and Bard have today lies in the act of actually talking to them, not necessarily in the responses they produce.