ArXiv: Bans Work, Instructions Backfire — Empirical Study of Rules for AI Coding Agents
Why it matters
An analysis of 679 rule files and 25,532 rules from GitHub shows that prohibitions improve AI coding agents, but positive instructions actually hurt them. Random rules perform just as well as expertly written ones.
A large empirical study with over 5,000 AI agent runs on the SWE-bench Verified benchmark delivers a surprising finding: rule files such as CLAUDE.md or .cursorrules don’t function the way developers think they do.
What did the researchers discover?
The research team analyzed 679 rule files collected from GitHub, containing a total of 25,532 individual rules. They tested how these rules affect the performance of AI coding agents.
Key findings:
- Rules overall deliver a 7-14 percentage point improvement in task completion
- Randomly generated rules perform just as well as expertly written ones — suggesting that context “priming” is at play, not specific instructions
- Prohibitions (negative constraints like “never do X”) improve performance when applied individually
- Positive instructions (prescriptions like “always use approach Y”) actively hurt performance — the agent makes more mistakes than with no rules at all
Why does this matter?
Many developers use CLAUDE.md, .cursorrules, and similar rule files to guide AI assistants. This study suggests that telling the agent what NOT to do is far more effective than telling it how to work.
The researchers recommend: constrain what the agent must not do, rather than prescribing what it should. In other words, a short list of prohibitions outperforms lengthy best-practice guides.
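As a rough illustration of that recommendation, the sketch below trims a rule file down to its prohibitions. The keyword list, the `split_rules` helper, and the sample rules are all hypothetical and are not taken from the study, which may have classified rules quite differently.

```python
import re

# Hypothetical keyword heuristic for spotting prohibitions.
# This is an assumption for illustration, not the study's classification method.
PROHIBITION = re.compile(r"^(never|do not|don't|avoid|must not)\b", re.IGNORECASE)


def split_rules(lines):
    """Separate rule lines into prohibitions and all other rules."""
    prohibitions, others = [], []
    for line in lines:
        # Drop list markers and surrounding whitespace.
        rule = line.strip().lstrip("-* ").strip()
        if not rule:
            continue
        if PROHIBITION.match(rule):
            prohibitions.append(rule)
        else:
            others.append(rule)
    return prohibitions, others


# Made-up sample rules, in the style of a CLAUDE.md or .cursorrules file.
example_rules = [
    "- Never commit directly to main",
    "- Always use dependency injection",
    "- Do not edit generated files",
    "- Prefer small functions",
]

kept, dropped = split_rules(example_rules)
print("\n".join(f"- {rule}" for rule in kept))
```

A keyword match like this will misclassify some rules (for example, a prescription phrased as "avoid X by always doing Y"), so its output is best treated as a first pass to review by hand.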
Implications for the industry
This calls into question the popular practice of writing detailed rules for AI agents. Agents appear to perform better with clear boundaries than with detailed instructions, much as human teams tend to work better under clear constraints than under micromanagement.
This article was generated using artificial intelligence from primary sources.