Google has just announced Gemini 1.5, the successor to its large language model Gemini, which it launched two months ago. Gemini 1.5 is a powerful AI tool that can be used for various purposes, such as business analytics, personal assistance, and content creation. Google is making Gemini 1.5 available to developers and enterprise users first, before rolling it out to consumers.
What’s new in Gemini 1.5?
Gemini 1.5 has several improvements over Gemini 1.0, including:
- Performance: Gemini 1.5 Pro, the general-purpose model, is reported to be on par with Gemini Ultra, the high-end model, and it outperforms Gemini 1.0 Pro on 87 percent of benchmark tests.
- Efficiency: It uses a technique called Mixture of Experts (MoE), which means it only runs the relevant part of the model for each query, instead of processing the entire model all the time. This makes the model faster and more efficient for both users and Google.
- Context window: It has a context window of 1 million tokens, so it can handle much larger queries and data sets than Gemini 1.0, which has a context window of 32,000 tokens. For comparison, OpenAI’s GPT-4 has a context window of 128,000 tokens.
Tokens are the units of text that the model uses to understand and generate language. Google CEO Sundar Pichai puts the 1 million-token window in practical terms: “It’s about 10 or 11 hours of video, tens of thousands of lines of code.” The context window lets users ask the model about all of that content at once.
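To make the Mixture of Experts idea above more concrete, here is a minimal sketch of top-k expert routing in plain Python. It is a generic illustration, not Google's implementation: Google has not described Gemini 1.5's routing here, and the class name `TinyMoE`, the toy "experts" that merely scale their input, and the random router weights are all assumptions made for the example. The point is only that each query touches a few experts rather than the whole model.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical expert: multiplies every input value by `scale`."""
    return lambda x: [scale * v for v in x]


class TinyMoE:
    """Toy Mixture of Experts layer with top-k routing (illustrative only)."""

    def __init__(self, num_experts, dim, top_k=2, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # One row of router weights per expert (random here, learned in practice).
        self.router = [[rng.uniform(-1, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        self.experts = [make_expert(i + 1) for i in range(num_experts)]
        self.calls = [0] * num_experts  # how often each expert actually ran

    def forward(self, x):
        # Score every expert for this input, then keep only the top_k.
        logits = [sum(w * v for w, v in zip(row, x)) for row in self.router]
        ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        chosen = ranked[:self.top_k]
        weights = softmax([logits[i] for i in chosen])

        # Run only the chosen experts and blend their outputs.
        out = [0.0] * len(x)
        for weight, idx in zip(weights, chosen):
            self.calls[idx] += 1
            expert_out = self.experts[idx](x)
            out = [o + weight * v for o, v in zip(out, expert_out)]
        return out, chosen


moe = TinyMoE(num_experts=8, dim=4, top_k=2)
output, used = moe.forward([1.0, 0.5, -0.3, 2.0])
print(f"experts used: {used}, experts skipped: {8 - len(used)}")
```

With eight experts and `top_k=2`, six of the eight experts do no work for this query, which is the source of the efficiency gain the article describes.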
What can Gemini do?
Pichai is especially excited about the potential of Gemini 1.5’s context window. He says it enables use cases where users can add a lot of personal context and information to their queries. “Think of it as we have dramatically expanded the query window,” he says.
For example, Pichai imagines filmmakers might upload their entire movie and ask Gemini what reviewers might say; he sees companies using Gemini to look over masses of financial records. “I view it as one of the bigger breakthroughs we have done,” he says.
Pichai also mentions that someone at Google might already have used Gemini 1.5 to analyze the entire Lord of the Rings trilogy, to see whether the AI can spot continuity errors, understand the complex history of Middle-earth, or make sense of Tom Bombadil. “I’m sure it has happened,” Pichai laughs, “or will happen, one of the two.”
How can you access Gemini?
For now, Gemini 1.5 is available only to business users and developers, through Google’s Vertex AI and AI Studio platforms. Google plans to replace Gemini 1.0 with Gemini 1.5 soon, and to offer Gemini 1.5 Pro with a 128,000-token context window as the standard version for everyone at gemini.google.com and in its apps.
Users who want to access the full 1 million-token context window will have to pay extra. Google is also testing the model’s safety and ethical boundaries, especially with the larger context window.
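As a rough illustration of what these context window sizes mean in practice, the sketch below estimates a text's token count and reports which windows it would fit in. The window sizes are the figures quoted in this article; the dictionary keys are hypothetical labels rather than real API model identifiers, and the four-characters-per-token ratio is only a common rule of thumb for English text, not how any of these models actually tokenizes.

```python
import math

# Context window sizes as quoted in this article. The keys are
# hypothetical labels, not official API model names.
CONTEXT_WINDOWS = {
    "gemini-1.0": 32_000,
    "gemini-1.5-pro-standard": 128_000,
    "gpt-4": 128_000,
    "gemini-1.5-pro-extended": 1_000_000,
}


def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate; assumes about 4 characters per token."""
    return math.ceil(len(text) / chars_per_token)


def windows_that_fit(text):
    """Return the labels whose context window can hold the whole text."""
    needed = estimate_tokens(text)
    return [name for name, size in CONTEXT_WINDOWS.items() if needed <= size]


# A stand-in for a large document: 600,000 characters of text.
document = "x" * 600_000
print(estimate_tokens(document))
print(windows_that_fit(document))

# How much larger the extended window is than the older one.
print(CONTEXT_WINDOWS["gemini-1.5-pro-extended"] / CONTEXT_WINDOWS["gemini-1.0"])
```

Under these assumptions, a 600,000-character document needs roughly 150,000 tokens, which exceeds the 128,000-token standard tier and fits only the 1 million-token window. A real application would use the provider's own token counter instead of a character heuristic.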
Why does Gemini matter?
Google is in a fierce race to build the best AI tool in the industry, as businesses around the world look for AI solutions and choose among providers such as OpenAI and Google. Just this week, OpenAI announced “memory” for ChatGPT, and it appears to be preparing a move into web search.
Gemini 1.5 is Google’s latest attempt to impress and attract users, especially those already in its ecosystem, but there is still a lot of work to be done on all sides.
Pichai says that eventually, users won’t care about the technical details of Gemini 1.5 or its competitors. “People will just be consuming the experiences,” he says. “It’s like using a smartphone without always paying attention to the processor underneath.”
But right now, he says, we are still in the phase where everyone knows the chip inside their phone because it matters. “The underlying technology is shifting so fast,” he says. “People do care.”