Google: Computer Use in Gemini 3.5 Flash — agents for browser, mobile, and desktop

Google has integrated the Computer Use tool into Gemini 3.5 Flash, enabling AI agents to autonomously control browsers, mobile devices, and desktop applications. The model achieves the best OSWorld result recorded so far among Google’s models and ships with enterprise protections against prompt injection attacks.

This article was generated using artificial intelligence from primary sources.

Google announced the integration of the Computer Use tool directly into Gemini 3.5 Flash. The ability to control computer interfaces, previously reserved for the standalone Gemini 2.5 model, is now available in a faster and more accessible model.

What are computer use agents?

Computer use agents are AI systems that do more than answer questions: they autonomously control a computer’s graphical interface. They open applications, click buttons, fill out forms, and complete multi-step tasks in browsers, on mobile devices, and in desktop environments. Unlike classic chatbots, which only generate text, these agents execute actions in a real digital environment.
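The observe, decide, act cycle described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `Action`, `FakeBrowser`, `scripted_model`, and `run_agent` are hypothetical names, the environment is an in-memory stand-in rather than a real browser, and the "model" is a hard-coded policy rather than a call to the Gemini API.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """A single UI action proposed by the model (hypothetical schema)."""
    kind: str          # e.g. "click", "type", "done"
    target: str = ""   # element the action applies to
    text: str = ""     # text to type, if any


@dataclass
class FakeBrowser:
    """Stand-in environment; a real one would drive a browser or device."""
    fields: dict = field(default_factory=dict)
    submitted: bool = False

    def observe(self) -> dict:
        # A real agent would typically capture a screenshot here.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def execute(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def scripted_model(goal: str, observation: dict) -> Action:
    """Placeholder policy; a real agent would query a model instead."""
    if "email" not in observation["fields"]:
        return Action("type", target="email", text="user@example.com")
    if not observation["submitted"]:
        return Action("click", target="submit")
    return Action("done")


def run_agent(goal, env, model, max_steps=10):
    """Observe, decide, act; stop when the model reports done or steps run out."""
    history = []
    for _ in range(max_steps):
        action = model(goal, env.observe())
        if action.kind == "done":
            break
        env.execute(action)
        history.append(action)
    return history


env = FakeBrowser()
history = run_agent("Fill in the sign-up form", env, scripted_model)
print([a.kind for a in history], env.submitted)
```

The `max_steps` cap is a design choice worth keeping in any real loop: it bounds the cost of a run if the agent never reaches a "done" state.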

Gemini 3.5 Flash vs Gemini 2.5 — expanding access

The key change is broader access rather than a new technical capability: Computer Use was previously available exclusively in the standalone Gemini 2.5 model. Integrating it into Gemini 3.5 Flash, which is optimized for speed and cost-efficiency, means enterprise teams and developers can run agentic workflows at a lower cost per token.

On the OSWorld benchmark, a standardized test measuring AI agents’ ability to execute tasks in real operating systems, Gemini 3.5 Flash with Computer Use achieves the best result recorded to date among Google’s models on agentic tasks. OSWorld includes scenarios such as web browsing, file manipulation, and work with office applications, which makes it more representative of real-world use than synthetic tests.

Supported environments and enterprise protections

The model supports three classes of environments: browser (web applications and pages), mobile (Android and iOS interfaces), and desktop (Windows, macOS, Linux applications). A demo integration is available via the Browserbase platform.

Security is a central challenge for computer use agents because of prompt injection attacks: situations where malicious content on screen (e.g. hidden text on a web page) attempts to take control of the agent and force it to perform unauthorized actions. Google applied adversarial training in which the model was exposed to thousands of simulated injection scenarios. In addition, the system requires explicit user confirmation before sensitive actions and automatically halts execution upon detecting manipulation.
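The two runtime safeguards named above, confirmation before sensitive actions and halting on detected manipulation, can be sketched as a gate placed in front of every action. This is an illustrative assumption, not Google's implementation: the names `Verdict`, `review`, and `guarded_execute` are hypothetical, and the keyword check is a deliberately crude stand-in for a trained detector.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"  # pause and ask the user
    HALT = "halt"        # stop the whole run


# Hypothetical lists; a real system would not rely on keyword matching.
SENSITIVE_KINDS = {"purchase", "send_email", "delete_file"}
SUSPICIOUS_PHRASES = (
    "ignore previous instructions",
    "disregard your instructions",
)


def review(action_kind: str, screen_text: str) -> Verdict:
    """Classify a proposed action given the text currently on screen."""
    lowered = screen_text.lower()
    if any(phrase in lowered for phrase in SUSPICIOUS_PHRASES):
        return Verdict.HALT
    if action_kind in SENSITIVE_KINDS:
        return Verdict.CONFIRM
    return Verdict.ALLOW


def guarded_execute(action_kind, screen_text, execute, ask_user):
    """Run execute() only if the review allows it or the user approves."""
    verdict = review(action_kind, screen_text)
    if verdict is Verdict.HALT:
        raise RuntimeError("possible prompt injection detected; run halted")
    if verdict is Verdict.CONFIRM and not ask_user(action_kind):
        return False  # user declined, nothing was executed
    execute()
    return True
```

In this sketch the injection check runs before the sensitivity check, so a suspicious screen halts the run even when the proposed action itself looks harmless.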

Availability

Computer Use in Gemini 3.5 Flash is available in the Gemini API and the Google Enterprise Agent Platform. Developers can start building agentic applications immediately without waiting for access to the premium Gemini 2.5 tier.

The move clearly signals Google’s direction: computer use agents are not an experimental feature but are becoming a standard part of production AI infrastructure.

Frequently Asked Questions

What are computer use agents and how do they differ from classic AI chatbots?

Computer use agents are AI systems that can autonomously control a graphical computer interface: clicking, typing, scrolling, and executing tasks in real applications without human intervention at each step. Classic chatbots only generate text, whereas these agents carry out actions in a real digital environment.

How does Google protect users from prompt injection attacks in Computer Use?

Google applies adversarial training, requires explicit user confirmation for sensitive actions, and automatically halts execution as soon as the system detects a prompt injection attempt.