🔴 🏥 In Practice · Friday, May 1, 2026 · 3 min read

DeepMind AI co-clinician: in blind evaluation of 98 primary care queries doctors preferred it over leading tools, zero critical errors in 97/98 cases


Google DeepMind announced the AI co-clinician research initiative on April 30, 2026 — a triadic care model in which an AI agent assists patients under clinical oversight of a physician. In blind head-to-head evaluations of 98 realistic primary care queries, doctors consistently preferred co-clinician responses over two leading evidence synthesis tools, and the system recorded zero critical errors in 97 of 98 cases.

In the model, which its authors call “triadic care,” an AI agent assists patients in their care journey under the clinical authority of their physician; the aim is to extend the reach of the medical team while the physician retains judgment and control over decisions. The initiative builds on DeepMind’s earlier work on Med-PaLM (medical knowledge benchmarks) and AMIE (simulated medical consultations with patients in feasibility studies).

What does triadic care mean in practice?

Triadic care is a triad of patient, physician, and AI agent, where the AI enters as a “new player on the field,” not as a replacement for the doctor. Medicine has always been a team sport, argue the authors Alan Karthikesalingam, Vivek Natarajan, and Pushmeet Kohli, and AI agents can bring more team members into the game while the clinician still holds medical responsibility. The system was designed and tested along two separate tracks: supporting the physician (clinician-facing) and communicating directly with the patient (patient-facing).

How did the authors measure response quality?

DeepMind adapted the NOHARM framework in collaboration with academic physicians — an approach that separately measures “errors of commission” (incorrect information) and “errors of omission” (missing critical information). In blind head-to-head evaluations, doctors consistently preferred AI co-clinician responses over those from leading evidence synthesis tools. In an objective analysis of 98 realistic primary care queries, the system recorded zero critical errors in 97 cases, improving on two AI systems widely used in clinical practice.
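To make the two error types concrete, here is a minimal sketch of how a NOHARM-style grading could be tallied in code. This is an illustrative reconstruction, not DeepMind's actual implementation; the class and function names (`QueryGrade`, `summarize`) are hypothetical.

```python
# Hypothetical NOHARM-style tally: each graded response records
# commission errors (incorrect information present) and omission
# errors (critical information missing); a query is "clean" only
# if it has zero of both.
from dataclasses import dataclass


@dataclass
class QueryGrade:
    query_id: str
    commission_errors: int = 0  # incorrect statements in the response
    omission_errors: int = 0    # critical information left out

    @property
    def critical_error_free(self) -> bool:
        return self.commission_errors == 0 and self.omission_errors == 0


def summarize(grades: list[QueryGrade]) -> dict:
    """Aggregate per-query grades into the headline statistic."""
    total = len(grades)
    clean = sum(g.critical_error_free for g in grades)
    return {
        "queries": total,
        "zero_critical_error": clean,
        "rate": clean / total if total else 0.0,
    }


# Illustrative data shaped like the reported result: 98 queries,
# one of which has a critical omission, so 97 of 98 are clean.
grades = [QueryGrade(f"q{i}") for i in range(97)]
grades.append(QueryGrade("q97", omission_errors=1))
print(summarize(grades))
```

Separating the two counters matters: a response can be entirely accurate (zero commission errors) yet still dangerous because it omits a red-flag symptom, which a single accuracy score would hide.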

What about query quality and methodology?

The study used a blind comparison of 98 realistic primary care queries collected from diverse sources and subsequently refined by a panel of physicians. A multi-step iterative process involved background research and the development of query-specific metrics, enabling precise measurement of consensus commission and omission errors. The goal was to ensure the evaluation reflected the complexity of real clinical decision-making, rather than testing the system on simplified cases.

Why is this a turning point?

Most prior medical AI results came from exam-style questions or simulated consultations. The co-clinician is positioned, for the first time, as a component of the clinic in which the physician retains authority and the AI agent works alongside, a setup DeepMind considers a prerequisite for clinical adoption. The World Health Organization estimates a global shortfall of more than 10 million healthcare workers by 2030, which makes this kind of scaling economically pressing, and the evaluation results suggest AI can now do more than pass medical knowledge tests.

Frequently Asked Questions

What is the triadic care model?
An approach in which an AI agent assists patients in their care journey under the clinical authority of a physician. The doctor retains judgment and control, while AI extends the team's reach — DeepMind describes it as a new player on the field, not a replacement.
How many critical errors did the AI co-clinician make in the evaluation?
The system recorded zero critical errors in 97 of 98 realistic primary care queries, outperforming two AI systems that doctors currently use in practice.
What is the NOHARM framework?
A methodological framework for testing AI systems in medicine that separately measures errors due to incorrect information (commission) and errors due to omitting critical information (omission). DeepMind adapted it with academic physicians for the co-clinician evaluation.
🤖

This article was generated using artificial intelligence from primary sources.