Google has launched a new feature called Gemini Live for its Gemini AI chatbot assistant. Announced during the Pixel 9 event, this voice chat mode is now available for Gemini Advanced subscribers, enabling real-time voice conversations similar to ChatGPT’s capabilities.
Interactive Conversations in Real-Time
Gemini Live lets users hold natural, free-flowing conversations: they can interrupt the assistant mid-response or pause an exchange and resume it later. The feature keeps working even when the device is locked or the user is multitasking in another app. First demonstrated at the I/O developer conference earlier this year, Gemini Live can also interpret video content in real time.
Users can personalize the experience by choosing from ten different voices. Currently, the feature is exclusive to Android devices in English, with future plans to expand to iOS and additional languages.
Additional AI Features
Alongside Gemini Live, Google has introduced other new functionality for its AI assistant, including extensions for the Keep, Tasks, Utilities, and YouTube Music apps. Gemini will also gain screen-aware capabilities, akin to those Apple introduced at WWDC.
Users can get detailed information about what's on their screen by selecting options like "Ask about this screen" or "Ask about this video": for example, extracting destinations from a travel video and adding them to Google Maps.
Performance and User Feedback
TechCrunch tested Gemini Live at the Made by Google event. The feature responded in under two seconds and adjusted seamlessly when interrupted. Despite occasional shortcomings, it proved effective for hands-free phone use. Google collaborated with voice actors to create diverse, humanlike voice options for the feature.
In one demonstration, a Google product manager used Gemini Live to find family-friendly wineries near Mountain View with outdoor areas. Gemini suggested Cooper-Garrod Vineyards in Saratoga but included incorrect information about a nearby playground. At times, the AI's speech also overlapped with the user's. Additionally, Gemini Live cannot sing or mimic voices outside its standard set, likely due to copyright concerns.
Future Developments and Comparisons
Gemini Live serves as Google's answer to OpenAI's Advanced Voice Mode, which is currently in limited alpha testing. While OpenAI announced its feature first, Google is the first to make a finished version generally available to users. Unlike OpenAI, Google is not focusing on detecting emotional intonation in the user's voice. The development aligns with Google's ongoing Project Astra, which aims for a fully multimodal AI model.
Last Updated on November 7, 2024 3:19 pm CET