Google Tests “Computer Use” AI Agent in AI Studio, Potentially Streamlining Gemma 3 Deployment

Google's AI Studio may soon feature "Computer Use" for AI agents, as code traces spotted on the platform suggest, alongside streamlined Gemma 3 and Cloud Run integration.

The tech industry’s pursuit of AI that can actively operate computers, not just respond to queries, is seeing another potential entrant, as signs point to Google exploring a “Computer Use” function within its AI Studio.

This development, hinted at by code traces that briefly appeared on May 5, suggests Google is looking to equip its developer platform with tools for AI agents to interact directly with graphical user interfaces and native applications, a field already active with competitors like Microsoft, Anthropic, and OpenAI.

Google’s Foray into Agentic AI

While Google has not made a formal announcement, the “Computer Use” label aligns with industry terminology for AI systems capable of observing screen content, controlling cursors, and inputting text.
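Descriptions like this generally reduce to a simple loop: capture the screen, send it to a model, and execute whatever action comes back. The sketch below is a hypothetical illustration of that loop, not Google's implementation; no API for the rumored feature has been published, so the endpoint URL and the action schema here are invented placeholders.

```python
# Illustrative observe-act loop for a "computer use" style agent.
# The model endpoint and action schema are hypothetical placeholders;
# Google has published no API for the rumored AI Studio feature.
import base64
import io

import pyautogui  # screen capture plus cursor/keyboard control
import requests

HYPOTHETICAL_ENDPOINT = "https://example-agent-endpoint/v1/act"  # placeholder


def capture_screen_b64() -> str:
    """Grab the current screen and return it as a base64-encoded PNG."""
    buf = io.BytesIO()
    pyautogui.screenshot().save(buf, format="PNG")
    return base64.b64encode(buf.getvalue()).decode("ascii")


def run_step(goal: str) -> None:
    """One observe-think-act cycle: screenshot in, UI action out."""
    reply = requests.post(
        HYPOTHETICAL_ENDPOINT,
        json={"goal": goal, "screenshot_png_b64": capture_screen_b64()},
        timeout=60,
    ).json()

    action = reply.get("action", {})  # e.g. {"type": "click", "x": 340, "y": 120}
    if action.get("type") == "click":
        pyautogui.moveTo(action["x"], action["y"])
        pyautogui.click()
    elif action.get("type") == "type":
        pyautogui.write(action["text"], interval=0.02)


if __name__ == "__main__":
    run_step("Open the downloads folder")
```

A production agent would run this cycle repeatedly until the model signals completion, with guardrails around destructive actions; the single-step version above only shows the basic observe-and-act contract.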

This capability appears connected to Google’s lighter-weight Gemma 3 models, as a system message accompanying the code sighting stated, “Gemma 3 will be deployed as a Cloud Run service in your GCP project. Update your SDK to point to the Cloud Run endpoint.”

Such an integration could allow developers using AI Studio to deploy containerized Gemma instances with relative ease, potentially with a single click. Cloud Run already supports GPU-backed serverless inference and scales to zero when idle, making it a suitable environment for such models. Gemma 3 is Google's family of open, lightweight models designed for efficiency, often capable of running on a single GPU or TPU.

Official Google documentation from March confirms Gemma 3’s compatibility with AI Studio and lists Cloud Run, alongside Vertex AI and Google Kubernetes Engine (GKE), as supported deployment targets.

The documentation also points to existing tutorials showing how to package Gemma using frameworks like vLLM or Ollama and expose public HTTPS endpoints, a process that direct integration into AI Studio could further simplify. Integrating such controls could give AI Studio an orchestration layer and a local execution sandbox, letting some tasks run on-device while more demanding computations are handled remotely. For developers, that would shorten the distance between prompt design and a live API.
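Once such a container is running, "pointing your SDK at the Cloud Run endpoint" can be as plain as an HTTPS call. The sketch below assumes a Gemma container served with Ollama, as in the tutorials mentioned above, which exposes a POST /api/generate route; the service URL and model tag are placeholders for whatever a real deployment produces.

```python
# Minimal sketch of calling a Gemma 3 container on Cloud Run, assuming
# it is served with Ollama (POST /api/generate). The service URL and
# model tag below are placeholders, not a real deployment.
import requests

SERVICE_URL = "https://gemma-service-xyz-uc.a.run.app"  # placeholder Cloud Run URL


def ask_gemma(prompt: str) -> str:
    # If the Cloud Run service requires authentication, attach an
    # identity token via an "Authorization: Bearer <token>" header.
    resp = requests.post(
        f"{SERVICE_URL}/api/generate",
        json={"model": "gemma3", "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]


print(ask_gemma("Summarize what Cloud Run's scale-to-zero means for costs."))
```

A vLLM-based container would expose an OpenAI-compatible /v1/chat/completions route instead, but the principle is the same: the deployed service is just an HTTPS endpoint the client targets.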

This isn’t Google’s first exploration into AI agents controlling digital environments. “Project Mariner,” an early research prototype using Gemini 2.0, was detailed by Google in December as an AI agent that can understand and reason across browser screen information, including pixels, text, and forms. Google stated that Mariner, as a single agent setup, “achieved a state-of-the-art result of 83.5% on WebVoyager.”

Mariner was previously known internally as "Project Jarvis" and briefly surfaced on the Chrome Web Store in November 2024, described then as a companion for web surfing. The AI Studio platform itself has been evolving, adding Gemini 2.5 Pro integration and, since May 3, screen-sharing capabilities, making it a logical home for more advanced agentic tools.

The Competitive Field of Computer-Controlling AI

Google’s potential move follows several other companies that have already introduced or are developing similar AI functionalities. Microsoft began previewing a “computer use” feature in its Copilot Studio in April, targeting enterprise automation by enabling AI to simulate human actions on desktops and web apps. Charles Lamanna, Microsoft’s Corporate Vice President for Business & Industry Copilot, remarked at the time, “If a person can use the app, the agent can too.”

Anthropic was earlier to the scene, updating its Claude 3.5 Sonnet model in October 2024 with an API-based “Computer Use” feature, allowing developers to direct the AI in tasks involving screen interaction and control. Early adopters like Asana and DoorDash reportedly used this for multi-step processes, though the feature was described as experimental and sometimes prone to errors at launch.
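Unlike Google's unannounced feature, Anthropic's version shipped with public documentation. A request at launch looked roughly like the sketch below; the version-pinned tool type and beta flag are the October 2024 identifiers and have since been superseded in newer releases.

```python
# Sketch of Anthropic's "Computer Use" beta as documented at its
# October 2024 launch; the version-pinned tool type and beta flag
# shown here have since been revised in later releases.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=[{
        "type": "computer_20241022",  # the screen/mouse/keyboard tool
        "name": "computer",
        "display_width_px": 1280,
        "display_height_px": 800,
    }],
    betas=["computer-use-2024-10-22"],
    messages=[{"role": "user", "content": "Open a browser and check the weather."}],
)

# The model replies with tool_use blocks (screenshot, click, type, ...)
# that the calling code must execute itself, sending the results back
# in a follow-up message so the loop can continue.
for block in response.content:
    print(block.type, getattr(block, "input", None))
```

Note the division of labor: the API only decides which action to take; executing clicks and keystrokes, and feeding screenshots back, remains the integrator's responsibility.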

OpenAI introduced its “Operator” agent in January 2025 for ChatGPT Pro subscribers, a browser-based tool using a Computer-Using Agent (CUA) model that interprets screenshots to interact with websites, requiring user confirmation for actions. By February, OpenAI expanded Operator’s availability.

A Reality Check on Agent Performance

Despite the advancements, the practical effectiveness of current AI agents in handling complex professional duties autonomously is still under scrutiny. A Carnegie Mellon University study published on May 5, titled “TheAgentCompany,” provided a sober assessment. Simulating a software firm, the study found that even the leading AI, Anthropic’s Claude 3.5 Sonnet, only fully completed 24% of assigned tasks, at an average operational cost of over $6 per task. Google’s Gemini 2.0 Flash managed 11.4% completion, while OpenAI’s GPT-4o achieved 8.6%.

The researchers highlighted “a lack of common sense, poor social skills, and incompetence in web browsing” as common issues. For instance, agents struggled with basic file understanding or dismissing simple on-screen pop-ups.

The study also noted better performance in software development tasks compared to administrative or financial roles, possibly due to the larger corpus of public code available for training. The conclusion was that while AI agents can assist with parts of human work, they are “likely not a replacement for all tasks at the moment.” This presents a notable performance benchmark that any new “Computer Use” feature from Google would implicitly be measured against.

The Autonomous Frontier and Its Implications

Beyond tools that assist or automate under supervision, the industry is also seeing the rise of more independent agents. Manus AI, from Chinese startup Butterfly Effect, launched around March 6, 2025, and is marketed as an autonomous agent capable of planning and executing digital tasks without constant human oversight, reportedly using models like Anthropic's Claude and Alibaba's Qwen.

After an initial period of high demand for invite codes, Manus AI introduced paid subscription plans on March 31, 2025. However, the autonomous nature of such agents has quickly drawn attention from regulators, with Manus AI facing bans on state networks in Tennessee and Alabama due to security and propaganda concerns.

Google’s own broader strategy includes significant investment in AI agents, with Google Cloud referring to multi-agent AI systems as the “next frontier” and announcing new tools in Vertex AI at its Cloud Next conference in April 2025. As Google potentially readies a “Computer Use” feature for AI Studio, its success will depend not only on the technical capabilities and ease of deployment for models like Gemma 3 but also on addressing the reliability and safety considerations that are becoming increasingly prominent in the field of AI-driven computer operation.

The fleeting code commit suggests that Google is actively working to blend desktop-level control with serverless model hosting, potentially turning AI Studio into a more comprehensive platform for developing with Gemma and future models. Whether this “Computer Use” feature ships broadly or remains an internal experiment will likely depend on these ongoing safety evaluations and Google’s evolving agent strategy.

Markus Kasanmascheff
Markus has been covering the tech industry for more than 15 years. He holds a Master's degree in International Economics and is the founder and managing editor of Winbuzzer.com.
