Google: Computer Use in Gemini 3.5 Flash — agents for browser, mobile, and desktop
Google has integrated the Computer Use tool into Gemini 3.5 Flash, enabling AI agents to autonomously control browsers, mobile devices, and desktop applications. On OSWorld, the model posts the best score recorded to date for agentic tasks among Google's models, and it ships with enterprise protections against prompt injection attacks.
This article was generated using artificial intelligence from primary sources.
Google announced the integration of the Computer Use tool directly into Gemini 3.5 Flash, bringing the ability to control computer interfaces — previously reserved for the standalone Gemini 2.5 — to a significantly more accessible and faster model.
What are computer use agents?
Computer use agents are AI systems that not only answer questions but autonomously control a computer’s graphical interface: they open applications, click buttons, fill out forms, and complete multi-step tasks in browsers, mobile devices, and desktop environments. Unlike classic chatbots that generate text, these agents execute actions in a real digital environment.
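The observe, decide, act cycle described above can be illustrated with a minimal Python sketch. This is not the Gemini API: `Action`, `FakeEnvironment`, `run_agent`, and `scripted_model` are hypothetical names invented for illustration, and the "model" is a hard-coded script standing in for a real model call that would receive a screenshot and return the next UI action.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One UI action proposed by the model (hypothetical schema)."""
    kind: str          # e.g. "click", "type", or "done"
    target: str = ""   # description of the UI element to act on
    text: str = ""     # text to type, if any


@dataclass
class FakeEnvironment:
    """Stand-in for a browser, mobile, or desktop session; it only records actions."""
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real environment would return image bytes; a string keeps the sketch runnable.
        return f"screen after {len(self.log)} actions"

    def execute(self, action: Action) -> None:
        self.log.append(f"{action.kind}:{action.target}:{action.text}")


def run_agent(
    goal: str,
    env: FakeEnvironment,
    propose_action: Callable[[str, str], Action],
    max_steps: int = 10,
) -> int:
    """Loop: observe the screen, ask the model for an action, execute it.

    Returns the number of actions executed before the model reported "done"
    (or max_steps if it never did).
    """
    for step in range(max_steps):
        observation = env.screenshot()
        action = propose_action(goal, observation)
        if action.kind == "done":
            return step
        env.execute(action)
    return max_steps


def scripted_model(goal: str, observation: str) -> Action:
    """Hard-coded stand-in for a model call (assumption, for illustration only)."""
    script = {
        "screen after 0 actions": Action("click", "search box"),
        "screen after 1 actions": Action("type", "search box", "OSWorld benchmark"),
    }
    return script.get(observation, Action("done"))


if __name__ == "__main__":
    env = FakeEnvironment()
    steps = run_agent("search for the OSWorld benchmark", env, scripted_model)
    print(steps, env.log)
```

The `max_steps` cap is a design choice worth keeping in any real loop: an agent that never reports completion should stop rather than act indefinitely.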
Gemini 3.5 Flash vs Gemini 2.5 — expanding access
The key change is not a technical innovation but a democratization: Computer Use was previously available exclusively in the standalone Gemini 2.5 model. Integrating it into Gemini 3.5 Flash — optimized for speed and cost-efficiency — means enterprise teams and developers can run agentic workflows at significantly lower cost per token.
On the OSWorld benchmark, a standardized test measuring AI agents’ ability to execute tasks in real operating systems, Gemini 3.5 Flash with Computer Use achieves the best result recorded to date among Google’s models on agentic tasks. OSWorld includes scenarios such as web browsing, file manipulation, and work with office applications, which makes it more representative of real-world use than synthetic tests.
Supported environments and enterprise protections
The model supports three classes of environments: browser (web applications and pages), mobile (Android and iOS interfaces), and desktop (Windows, macOS, Linux applications). A demo integration is available via the Browserbase platform.
Security has been the central challenge for computer use agents because of prompt injection attacks: situations where malicious content on screen (e.g. hidden text on a web page) attempts to take control of the agent and force it to perform unauthorized actions. Google applied adversarial training in which the model was exposed to thousands of simulated injection scenarios. In addition, the system requires explicit user confirmation before sensitive actions and automatically halts execution upon detecting manipulation.
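The two runtime safeguards mentioned above, confirmation before sensitive actions and halting when manipulation is detected, can be sketched as a gate that every proposed action passes through. This is an illustration under loud assumptions, not Google's mechanism: the names `ProposedAction`, `gate`, `SENSITIVE_KINDS`, and `INJECTION_MARKERS` are hypothetical, and the keyword check is a toy heuristic standing in for detection that, per the article, comes from adversarial training of the model itself.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed, illustrative list of action kinds that need user sign-off.
SENSITIVE_KINDS = {"purchase", "delete", "send_email"}

# Toy heuristic only; real detection would not rely on keyword matching.
INJECTION_MARKERS = (
    "ignore previous instructions",
    "disregard your instructions",
)


class ExecutionHalted(Exception):
    """Raised to stop the agent when on-screen content looks manipulative."""


@dataclass
class ProposedAction:
    kind: str
    detail: str = ""


def looks_like_injection(screen_text: str) -> bool:
    lowered = screen_text.lower()
    return any(marker in lowered for marker in INJECTION_MARKERS)


def gate(
    action: ProposedAction,
    screen_text: str,
    confirm: Callable[[ProposedAction], bool],
) -> bool:
    """Return True if the action may run, False if the user declined it.

    Raises ExecutionHalted if the screen content looks like an injection attempt.
    """
    if looks_like_injection(screen_text):
        raise ExecutionHalted("possible prompt injection in on-screen content")
    if action.kind in SENSITIVE_KINDS:
        return confirm(action)  # explicit user confirmation required
    return True
```

In this sketch the injection check runs before the confirmation prompt, so a manipulated screen halts the run rather than asking the user to approve an action the attacker may have chosen.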
Availability
Computer Use in Gemini 3.5 Flash is available in the Gemini API and the Google Enterprise Agent Platform. Developers can start building agentic applications immediately without waiting for access to the premium Gemini 2.5 tier.
The move clearly signals Google’s direction: computer use agents are not an experimental feature but are becoming a standard part of production AI infrastructure.
Frequently Asked Questions
- What are computer use agents and how do they differ from classic AI chatbots?
- Computer use agents are AI systems that can autonomously control a graphical computer interface (clicking, typing, scrolling, and executing tasks in real applications) without human intervention at each step. Classic chatbots, by contrast, only generate text in response to questions.
- How does Google protect users from prompt injection attacks in Computer Use?
- Google applies adversarial training, requires explicit user confirmation for sensitive actions, and has introduced automatic execution termination as soon as the system detects a prompt injection attempt.