BioMysteryBench: Claude Mythos Preview Solves Bioinformatics Problems Even Experts Cannot, Opus 4.6 Achieves 77.4% on Human-Solvable Tasks
Anthropic released BioMysteryBench on April 29, 2026. The evaluation framework comprises 99 expert-level bioinformatics tasks whose objective ground truth is derived from experimental data. Claude Opus 4.6 achieves about 77.4% accuracy on the 76 human-solvable problems and 23.5% on the 23 superhuman tasks, while Mythos Preview solves some problems that a panel of human experts could not. Researchers describe the result as a watershed moment for AI in bioscience.