Google is gearing up to make some big changes to its AI chatbot, currently known as Bard. The chatbot, which can generate images and code from text prompts, will soon be renamed Google Gemini, according to a leaked changelog. The new name reflects Google’s intention to integrate its large language model (LLM) Gemini across its products and services.
What is Google Gemini?
Google Gemini is the name of the LLM that powers the chatbot formerly known as Bard. Google launched Gemini last year, claiming that it can perform complex tasks like logical reasoning, coding, following nuanced instructions, and creative collaboration. Gemini has three versions: Gemini Nano, Gemini Pro, and Gemini Ultra, each with different capabilities and features.
Google Gemini is similar to Microsoft’s Copilot, which is also an AI chatbot that can generate images and code. Both chatbots use text prompts to create AI-generated content, but they have different underlying models. Copilot uses OpenAI’s GPT-4, while Gemini uses Google’s own LLM.
What are the new features and plans for Google Gemini?
According to the leaked changelog, Google plans to release some updates for Google Gemini on February 7. The updates include:
- Rebranding the chatbot from Bard to Gemini, to align with Google’s LLM strategy.
- Introducing a paid ‘Advanced’ tier powered by Gemini Ultra, which will offer enhanced multi-modal capabilities, improved coding support, and the ability to explore and analyze files and documents in greater depth.
- Expanding the availability of Gemini to Canada, where the chatbot has been unavailable since its initial launch.
- Launching a dedicated Gemini app for smartphone users, which will let them use Google AI on their phones for various tasks, such as learning, writing, and planning. Gemini will work with other Google services, such as Gmail, Maps, and YouTube. Android users will get a separate app for Gemini, while iOS users may be able to access it through the Google app.
- Supporting more languages, including Japanese, Korean, and English, in most parts of the world, except for some European countries and regions. Google plans to add more countries and languages to Gemini soon.
However, Google has not yet confirmed these updates, and official announcements are expected this week. Google Gemini is an ambitious project that aims to bring the power of Google AI to everyone, and we are curious to see how it will evolve and improve over time.
How does Google Gemini generate images and code?
Google Gemini is a multimodal AI model that can generate images and code using text prompts. It employs a novel architecture that integrates a multimodal encoder and decoder. The encoder transforms various data types, including text, images, video, audio, and code, into a common language that the decoder can understand. The decoder then produces outputs in different modalities based on the encoded inputs and the task at hand.
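The encode-then-decode flow described above can be pictured with a toy sketch. This is not Gemini's implementation: the function names, the four-number vector size, and the character and pixel bucketing are all assumptions made purely for illustration. The only point is that different input types end up in one shared representation that a decoder can read.

```python
from typing import Callable, Dict, List

# Hypothetical toy model. Every name and number here is illustrative
# and has nothing to do with Gemini's real internals.
EMBED_DIM = 4  # assumed size of the shared representation


def encode_text(text: str) -> List[float]:
    """Fold character codes into a fixed-size vector."""
    vec = [0.0] * EMBED_DIM
    for i, ch in enumerate(text):
        vec[i % EMBED_DIM] += float(ord(ch))
    return vec


def encode_image(pixels: List[int]) -> List[float]:
    """Fold pixel intensities into a vector of the same size."""
    vec = [0.0] * EMBED_DIM
    for i, value in enumerate(pixels):
        vec[i % EMBED_DIM] += float(value)
    return vec


# One encoder per input modality, all producing the same vector shape.
ENCODERS: Dict[str, Callable] = {"text": encode_text, "image": encode_image}


def encode(modality: str, data) -> List[float]:
    """Turn any supported input into the shared representation."""
    return ENCODERS[modality](data)


def decode(latent: List[float], target_modality: str) -> str:
    """Stand-in decoder: a real one would generate pixels, tokens, or code."""
    return f"<{target_modality} output from latent of size {len(latent)}>"


latent = encode("text", "a cat wearing a hat")
print(decode(latent, "image"))
```

Because both encoders return vectors of the same shape, the decoder does not need to know which kind of input it started from. That is the idea behind the "common language" in the paragraph above.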
For example, if the user enters a text prompt describing an image, such as “a cat wearing a hat”, Gemini will encode the text into a latent representation and then decode it into an image output. Similarly, if the user enters a text prompt describing a code task, such as “write a function to reverse a string in Python”, Gemini will encode the text and then decode it into code output. Gemini can also generate text outputs from image or code inputs, as well as mix and match different modalities.
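To make the coding example concrete, a response to the string-reversal prompt might look something like the following. This is a hand-written illustration, not actual Gemini output, and the function name is our own choice.

```python
def reverse_string(text: str) -> str:
    """Return the characters of `text` in reverse order."""
    # A slice with a step of -1 walks the string from the end to the start.
    return text[::-1]


print(reverse_string("Gemini"))  # prints "inimeG"
```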
Gemini can produce images natively instead of relying on an intermediate natural language description, a step that can limit a model’s ability to express images. This allows it to generate images from few-shot prompts made up of interleaved image and text sequences. Gemini can also generate code in a variety of languages, including Python, Java, C++, and HTML.
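An interleaved few-shot prompt can be pictured as a simple list that alternates text and images and ends with the new request. The sketch below is only a data-structure illustration: the class names, the helper function, and the file paths are all hypothetical and are not part of any Google API.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class TextPart:
    text: str


@dataclass
class ImagePart:
    path: str  # hypothetical image file path


Part = Union[TextPart, ImagePart]


def build_few_shot_prompt(examples: List[Tuple[str, str]], query: str) -> List[Part]:
    """Interleave (description, image) example pairs, then append the new request."""
    parts: List[Part] = []
    for description, image_path in examples:
        parts.append(TextPart(description))
        parts.append(ImagePart(image_path))
    # The final text part has no image: that is what the model is asked to produce.
    parts.append(TextPart(query))
    return parts


prompt = build_few_shot_prompt(
    [("a dog wearing a scarf", "dog.png"), ("a fox wearing boots", "fox.png")],
    "a cat wearing a hat",
)
print(len(prompt))  # prints 5
```

Two worked examples followed by one open request is the "few shots" pattern: the model sees how descriptions map to images before being asked for a new one.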
Google Gemini is accessible via Google Bard, an AI chatbot that was previously powered by a different model. To use Google Bard, the user must be at least 18 years of age and have either a personal Google account or a Google Workspace account with Bard access enabled.