Google Gemini Live Can Now Respond Based on What’s on Your Screen or in Front of You

Google has introduced live video and on-screen query capabilities for its Gemini Live AI assistant, allowing real-time AI processing via camera and screen analysis on Android.

At Mobile World Congress (MWC) 2025, Google announced a major expansion of its Gemini Live AI assistant, introducing live video and on-screen query capabilities. These new features let users interact with Gemini using their smartphone’s camera feed and on-screen content, enabling real-time, multimodal AI assistance.

The update, which is set to roll out to Google One AI Premium subscribers later this month, marks a significant step in Google’s push toward making AI a core element of the Android ecosystem. It follows a series of recent enhancements, including deeper research capabilities, memory recall, and an expansion of Gemini AI into Google Workspace.

How the New Features Work

Google’s latest Gemini AI update expands its capabilities by integrating Live Video AI Queries and On-Screen AI Interaction, two features designed to bring real-time intelligence to everyday mobile experiences.

Both features allow users to interact with Gemini in a more natural, intuitive way, without relying solely on text-based prompts.

Live Video AI Queries

The Live Video AI Queries feature enables users to point their smartphone camera at objects, text, or scenes and ask Gemini questions based on what it sees.

Whether identifying unfamiliar landmarks, solving math problems from a written equation, or providing step-by-step guidance on repairing a household item, Gemini can process the live feed and generate relevant responses.

Live Video AI Queries builds on Google’s previous AI-powered image recognition technology but takes it a step further, enabling the assistant to analyze dynamic, real-time video rather than static images.
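To illustrate the kind of request involved, here is a minimal sketch that sends a single camera frame and a question to the Gemini API using Google’s google-generativeai Python SDK. It approximates the underlying multimodal capability rather than Google’s actual Gemini Live implementation; the model name, API key, and file path are placeholders.

```python
# Minimal sketch: ask a question about a single captured camera frame using
# the public Gemini API. This approximates the feature's underlying multimodal
# capability, not Google's implementation; model name, API key, and file path
# are placeholders.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")            # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name

frame = Image.open("camera_frame.jpg")             # one frame grabbed from the live feed
response = model.generate_content(
    [frame, "What landmark is this, and what is it known for?"]
)
print(response.text)
```

The consumer feature goes further by streaming continuous video and voice rather than single frames, but the request-and-response pattern follows the same idea.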

On-Screen AI Interaction

The second feature, On-Screen AI Interaction, allows Gemini to analyze content displayed on the user’s phone screen and provide relevant information or assistance. This means users can summon Gemini while reading an article, reviewing a document, or browsing a website to get explanations, summaries, or translations without switching apps.

For example, a user reading a scientific paper can ask Gemini to simplify complex terms, while someone reviewing a contract can request a plain-language breakdown of the legal text. This seamless integration of AI into everyday browsing and work tasks eliminates the need to copy and paste content into a separate chatbot interface.
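The same pattern can approximate on-screen assistance: pass a screenshot of whatever is displayed and ask for a summary or explanation. The sketch below streams the reply back chunk by chunk; as before, the model name and file name are assumptions, and the shipped feature reads the screen directly rather than from a saved image.

```python
# Illustrative sketch: summarize on-screen content by sending a screenshot to
# the Gemini API and streaming the reply. Model name and file name are
# assumptions; the shipped feature captures the screen itself.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")            # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name

screenshot = Image.open("contract_page.png")       # what is currently on screen
response = model.generate_content(
    [screenshot, "Explain this contract clause in plain language."],
    stream=True,
)
for chunk in response:                             # print the answer as it arrives
    print(chunk.text, end="")
```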

Google’s Strategy: Expanding Gemini AI Beyond Text

Google’s latest AI advancements align with its broader strategy to transform Gemini AI into a full-fledged research and productivity tool.

Earlier in February, Google added its Deep Research feature to the Gemini Android app, allowing Gemini Advanced users to conduct structured investigations by compiling and analyzing multiple sources.

This shift toward multimodal AI interactions—incorporating Google’s Gemini 2.0 Flash Thinking model—suggests a growing emphasis on making AI a real-time, interactive assistant rather than just a chatbot.

The focus on AI-driven mobile experiences at MWC 2025 places Google in direct competition with Apple’s upcoming AI initiatives for iOS 18 and OpenAI’s continued push with new capabilities in ChatGPT, like the live video support for Advanced Voice Mode it added last December.

With Gemini AI now integrated into Android’s core functionality, Google aims to position it as the default AI companion for mobile users.

Beyond consumer applications, Google’s AI capabilities extend to productivity tools. Just days before MWC, Gemini AI was integrated into Google Sheets, enabling automatic data analysis and visualization—a move that aligns with Microsoft’s AI-powered Excel Copilot.

From AI Research to Real-Time Interaction: Gemini’s Evolution

Google’s latest AI advancements are the result of months of steady improvements to Gemini’s capabilities. The introduction of Gemini 2.0 Pro and Flash-Lite in early February brought significant technical improvements, particularly in reasoning and memory. The models now support a two-million-token context window, enabling Gemini to process far more information in a single session.
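For context, two million tokens is enough to hold entire books or large document collections in a single prompt. The hedged sketch below uses the SDK’s token-counting helper to check how much of such a window a local file would occupy; the model name and file name are assumptions.

```python
# Minimal sketch: estimate how much of a long context window a document uses,
# via the SDK's token counter. Model name and file name are assumptions.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")            # placeholder key
model = genai.GenerativeModel("gemini-1.5-pro")    # assumed long-context model

with open("research_notes.txt", encoding="utf-8") as f:
    document = f.read()

count = model.count_tokens(document)
print(f"{count.total_tokens:,} tokens of a ~2,000,000-token window")
```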

Additionally, Google has positioned Gemini as a long-term AI research tool. The Deep Research feature, which allows structured multi-source investigations, and Gemini’s memory recall update both emphasize AI’s role beyond simple chatbot interactions.

By incorporating real-time video and screen analysis, Google is bridging the gap between research-oriented AI tools and everyday user interactions.

Google’s introduction of real-time AI queries through live video and screen-based interactions could be a precursor to broader AI-powered device capabilities.

The move suggests an eventual shift toward AI-enhanced augmented reality (AR) applications, where users interact with the world around them through intelligent overlays.

Addressing the Challenges of Real-Time AI

While Google’s latest Gemini features mark an important step in AI’s evolution, real-time AI interactions bring new challenges. The ability to process live video raises concerns about privacy, security, and accuracy.

Ensuring that AI-generated responses are both reliable and free from bias remains a crucial challenge for Google and its competitors. Meta’s AI-powered Ray-Ban smart glasses faced backlash last year after two Harvard students demonstrated how, combined with facial recognition software, they could quickly reveal people’s personal details in real time.

Markus Kasanmascheff
Markus has been covering the tech industry for more than 15 years. He holds a Master’s degree in International Economics and is the founder and managing editor of Winbuzzer.com.
