UK AISI: Claude Mythos Preview achieves 73% on expert cyber tasks — first model to complete a full network attack

The UK AI Safety Institute (AISI) has published a comprehensive evaluation of the cyber capabilities of Anthropic’s latest model, Claude Mythos Preview. The results show a significant leap in AI systems’ ability to autonomously conduct cyber attacks under controlled conditions.

Key results

On expert-level Capture-the-Flag (CTF) tasks, Mythos Preview achieved a 73% success rate. No model could solve tasks at this level before April 2025, so the result marks a substantial advance over previous model generations.

An even more striking result comes from the “The Last Ones” (TLO) cyber range, a simulated 32-step attack on a corporate network that covers every phase from reconnaissance to full network takeover. A human expert would need an estimated 20 hours to complete it. Mythos Preview completed all 32 steps in 3 of 10 attempts and averaged 22 steps across all attempts. For comparison, Claude Opus 4.6 averaged 16 steps.

Important caveats

AISI emphasizes key limitations of the evaluation: the test environments lack defensive mechanisms such as active defenders, endpoint detection systems, and incident response teams. This makes the test systems “easier targets” than real hardened networks.

The institute recommends that organizations focus on cybersecurity fundamentals — regular patching, robust access controls, and implementation of the UK NCSC’s Cyber Essentials scheme. Future testing will focus on defended environments with active monitoring.
