Google is giving its artificial intelligence chatbot Bard a major upgrade, adding the ability to generate images from a text prompt for the first time.
Bard was upgraded in December to run on top of the Gemini Pro large language model (LLM), which is capable of higher levels of reasoning and recently saw the chatbot climb to second place in a widely recognized AI chatbot leaderboard, just behind the most advanced OpenAI model.
The new image generation capabilities don’t come from Gemini. Instead, the images are created using Google’s new Imagen 2 model, built by DeepMind, Google’s advanced AI lab.
In a bid to combat the spread of misinformation and deepfakes, Google says any image generated by Bard will also be tagged with SynthID. This is a tool, also built by DeepMind, that embeds a hidden watermark in the image pixels to confirm that the picture is AI-generated.
How will image generation work in Bard?
Google says Imagen 2 delivers the highest text-to-image quality yet, includes improvements around removing visual artefacts and responds better than the previous generation Imagen model to text prompts and instructions.
Much like DALL-E 3 in ChatGPT or Image Creator in Microsoft Copilot, you generate images in Bard with a simple description.
For example, you could type “create an image of a dog riding a surfboard” and Bard would create a range of choices for you to select from.
Jack Krawczyk, Product Lead for Bard, said they’ve also been working behind the scenes on the underlying model to ensure it generates safe and suitable images.
This is similar to the guardrails in place for DALL-E in ChatGPT and other AI image-generation tools including Adobe Firefly.
“Our technical guardrails and investments in the safety of training data seek to limit violent, offensive or sexually explicit content,” Krawczyk said, adding that “we apply filters designed to avoid the generation of images of named people.”
What isn’t clear yet is whether image generation will come to Assistant when Bard is incorporated into it later this year, although it seems like a logical inclusion for Google.
What else is coming to Bard?
When Google added Gemini Pro to Bard in December, it was restricted to a handful of countries and languages. This new update makes it available in over 40 languages and across 230 countries and territories.
It works natively across different languages for text, coding and reasoning, although image generation is English only at the moment.
Bard's "double-check" feature is also being expanded to other languages. This lets you click the G icon after Bard generates a response to check whether what the chatbot has said is correct. It is intended in part to combat the hallucination problem that plagues all large language models.
If you don’t want to use Bard for some reason or prefer standalone tools, then Google is also releasing ImageFX, an experimental standalone image generator built on the Imagen 2 model through its Labs service. Imagen 2 will also power Duet AI in Workspace.