AI agents leak private data in 68% of cases

AgentCIBench is a new benchmark testing whether computer-use agents respect contextual integrity — the principle that personal data is shared only in appropriate contexts. Of 15 frontier agents tested, 11 leak private data in more than 50% of scenarios, with an average leakage rate of 67.9%.

Researchers Anmol Goel and Iryna Gurevych of TU Darmstadt have published a paper reporting serious privacy failures in nearly all leading computer-use agents, the systems that manage email, calendars, and the desktop on behalf of users.

What is AgentCIBench and what does it measure?

AgentCIBench is an evaluation framework that tests whether AI agents respect contextual integrity, a privacy principle under which personal information should flow only in ways that fit the norms of the context in which it was originally shared. For instance, a health detail from an email should not end up in a calendar entry visible to colleagues, nor should personal financial information appear in an automated reply to a business contact. The benchmark simulates realistic personal-app usage scenarios and measures how often agents cross this boundary.
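One way to make the principle concrete is to model each disclosure as a flow from a source context to a destination context and check it against a table of norms. The sketch below is purely illustrative: the `Flow` class, the `ALLOWED_FLOWS` table, and the context labels are assumptions made for this example and are not taken from AgentCIBench.

```python
from dataclasses import dataclass

# Hypothetical norm table: (data type, source context) -> destinations
# where sharing that data is considered appropriate.
ALLOWED_FLOWS = {
    ("health", "personal_email"): {"personal_email", "doctor_portal"},
    ("finance", "bank_statement"): {"bank_statement", "tax_advisor_email"},
    ("meeting_time", "work_calendar"): {"work_calendar", "work_email"},
}


@dataclass(frozen=True)
class Flow:
    data_type: str    # kind of information being moved
    source: str       # context where it was originally shared
    destination: str  # context the agent wants to write it to


def violates_contextual_integrity(flow: Flow) -> bool:
    """Return True when no norm covers the flow (default deny)."""
    allowed = ALLOWED_FLOWS.get((flow.data_type, flow.source), set())
    return flow.destination not in allowed


# A health detail from a personal email copied into a shared work calendar.
leak = Flow("health", "personal_email", "work_calendar")
# A meeting time from the work calendar repeated in a work email.
ok = Flow("meeting_time", "work_calendar", "work_email")

print(violates_contextual_integrity(leak))  # True
print(violates_contextual_integrity(ok))    # False
```

The default-deny choice means any flow not listed in the table counts as a violation, which is stricter than a real deployment would likely want but keeps the example short.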

Do agents violate privacy — and how often?

Yes, and to a significant degree. Of the 15 frontier agents tested, 11 leak private data in more than 50% of scenarios, and the average leakage rate is 67.9%. For comparison, typical false positive rates in data-filtering security systems are below 5%; a leakage rate this high points to a systemic failure, not an edge case. Particularly concerning is that these failures also appear in end-to-end tasks, meaning that real workflows provide no additional protection.
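The two headline figures are a per-agent leakage rate and a count of agents above the 50% mark (11 of 15 is roughly 73% of the agents tested). A minimal sketch of that arithmetic follows. The agent names and outcomes are toy data invented for illustration, and averaging per-agent rates is an assumption about how the mean is formed; the paper may aggregate differently.

```python
from statistics import mean

# Toy data, invented for illustration: per-agent outcomes over five
# scenarios (True = the agent leaked private data in that scenario).
results = {
    "agent_a": [True, True, False, True, True],
    "agent_b": [False, False, True, False, False],
    "agent_c": [True, False, True, True, False],
}


def leakage_rate(outcomes):
    """Fraction of scenarios in which a leak occurred."""
    return sum(outcomes) / len(outcomes)


rates = {name: leakage_rate(o) for name, o in results.items()}

# Agents that leak in more than half of their scenarios.
majority_leakers = [name for name, r in rates.items() if r > 0.5]

# Unweighted mean of the per-agent rates.
average_rate = mean(rates.values())

print(majority_leakers)        # ['agent_a', 'agent_c']
print(f"{average_rate:.1%}")   # 53.3%
```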

Three leakage patterns to know

Researchers identify three distinct failure mechanisms. Visual co-location occurs when the agent retrieves data that happens to be visually close to the requested UI elements — such as a private message visible in a sidebar. Task-ambiguity overshare happens when a vague user query causes excessive sharing of personal information because the agent does not know where relevance ends. Recipient misalignment describes scenarios where the agent sends inappropriate data to the wrong recipient — for example, an internal note to an external client.
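Of the three patterns, recipient misalignment is the easiest to illustrate with a simple guard. The sketch below assumes drafts carry sensitivity labels and that the organization has a single email domain. `INTERNAL_DOMAIN`, the `"internal"` label, and both function names are hypothetical and do not come from the paper.

```python
INTERNAL_DOMAIN = "example.com"  # hypothetical organization domain


def is_external(address: str) -> bool:
    """Treat any address outside the organization's domain as external."""
    domain = address.rsplit("@", 1)[-1].lower()
    return domain != INTERNAL_DOMAIN


def misaligned_recipients(recipients, labels):
    """Return the external recipients of a draft labeled 'internal'."""
    if "internal" not in labels:
        return []
    return [r for r in recipients if is_external(r)]


# An internal note addressed to a colleague and an external client.
draft_labels = {"internal", "project-notes"}
to = ["alice@example.com", "client@customer.org"]

blocked = misaligned_recipients(to, draft_labels)
print(blocked)  # ['client@customer.org']
```

A check like this would run before the agent sends anything. It does nothing for visual co-location or task-ambiguity overshare, which depend on what the agent reads and how it interprets the request, not on who receives the result.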

Implications for security and development

The authors call for contextual privacy testing to be introduced as a mandatory pre-deployment step for AI agents that access personal data. AgentCIBench has been released as an open tool so the community can standardize this type of risk assessment. The paper was submitted on June 22, 2026.

Frequently Asked Questions

What is contextual integrity and why does it matter for AI agents?

Contextual integrity is a privacy principle stating that information should be shared only in ways that fit the context in which it was originally disclosed; for example, medical data should not end up in a business email. Computer-use agents that access calendars, inboxes, and files violate this principle when they transfer data from one context to another without authorization.

What are the specific ways agents leak data?

Researchers identify three main patterns: visual co-location (the agent retrieves forbidden data that is visually adjacent to requested elements), task-ambiguity overshare (a vague query causes excessive sharing of personal information), and recipient misalignment (inappropriate content is sent to the wrong recipient).

arXiv:2606.23189: 11 of 15 AI agents leak private data in more than half of scenarios
