Microsoft: Talos — open-source automated iterative genomic reanalysis for rare diseases

Talos is an open-source system that automatically and iteratively reanalyzes the genomes of patients with rare diseases. Applied to 4,735 undiagnosed patients, it found 241 new diagnoses (5.1% of the group). On a cohort of about 1,100 patients it reached 90% sensitivity, and annotation costs just $11 per 1,000 genomes.

This article was generated using artificial intelligence from primary sources.

Microsoft Research, in collaboration with the Centre for Population Genomics, Australian Genomics, and the Broad Institute, has released Talos, an open-source system for automated iterative reanalysis of the genomes of patients with rare diseases. The work was published in Nature Medicine.

What is genomic reanalysis and why does it matter?

Genomic reanalysis means re-examining an already-sequenced genome using updated knowledge bases and algorithms. Because the medical literature on rare diseases grows constantly, variants whose significance was unknown for years can gain diagnostic value today. Talos automates and scales this process: annotating 1,000 genomes costs only $11, which is many times cheaper than a manual approach and makes continuous reanalysis financially feasible.
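The idea of iterative reanalysis can be illustrated with a minimal Python sketch. This is not Talos code: the `Variant` class, the `reanalyse` function, and the gene names are hypothetical, and the real pipeline uses far richer annotation and filtering. The sketch only shows the core step of re-checking stored variants against a newer snapshot of gene-disease knowledge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A stored variant from an earlier sequencing run (hypothetical schema)."""
    patient_id: str
    gene: str
    change: str


def reanalyse(variants, known_genes_before, known_genes_now):
    """Return variants in genes that were linked to disease since the last run."""
    newly_linked = set(known_genes_now) - set(known_genes_before)
    return [v for v in variants if v.gene in newly_linked]


# Illustrative data only; gene names and changes are placeholders.
stored = [
    Variant("P1", "GENE_A", "c.100A>G"),
    Variant("P2", "GENE_B", "c.250del"),
    Variant("P3", "GENE_C", "c.77C>T"),
]
before = {"GENE_A"}           # knowledge base at the previous analysis
now = {"GENE_A", "GENE_C"}    # knowledge base today

candidates = reanalyse(stored, before, now)
for v in candidates:
    print(v.patient_id, v.gene, v.change)
```

Here only patient P3 is flagged, because GENE_C is the only gene that gained a disease association between the two snapshots. Rerunning the same check each time the knowledge base updates is what makes the process iterative.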

Results: 241 new diagnoses from 4,735 patients

On a cohort of about 1,100 patients, Talos achieves 90% sensitivity while returning only 1.3 candidate variants per patient for clinical review, far fewer than the hundreds of candidates generated by older tools. Applied to 4,735 previously undiagnosed patients, it identified 241 new diagnoses, or 5.1% of that group. On average, 32 days passed between a relevant record appearing in a public database and confirmation of the diagnosis.
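The headline figures can be checked with simple arithmetic. The short Python sketch below recomputes the diagnostic yield from the reported counts and estimates the annotation cost of one pass over the 4,735-patient cohort. The cost estimate assumes that the reported $11 per 1,000 genomes scales linearly, which the article does not state explicitly.

```python
# Figures reported in the article.
new_diagnoses = 241
undiagnosed_patients = 4735
cost_per_1000_genomes_usd = 11.0

# Diagnostic yield: share of previously undiagnosed patients who got a diagnosis.
yield_pct = new_diagnoses / undiagnosed_patients * 100

# Annotation cost of one pass over the cohort (assumes linear scaling).
cost_per_genome_usd = cost_per_1000_genomes_usd / 1000
cohort_cost_usd = undiagnosed_patients * cost_per_genome_usd

print(f"Diagnostic yield: {yield_pct:.1f}%")
print(f"Annotation cost per genome: ${cost_per_genome_usd:.3f}")
print(f"Annotation cost for one cohort pass: about ${cohort_cost_usd:.0f}")
```

Under that assumption, a full annotation pass over the cohort costs roughly $52, which is why rerunning the analysis regularly is financially realistic.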

Superiority over Exomiser

Exomiser was the previous standard for automated prioritization of genomic variants. Talos outperforms it in top-1 ranking of correct variants by a statistically significant margin (p<0.0001), and leaves far fewer candidates for a clinician to review manually. The difference is not marginal: 1.3 versus dozens of variants per patient directly affects clinical efficiency.

Open-source availability

Talos is publicly available on GitHub (populationgenomics/talos), meaning hospitals and research centers can adapt it to their own cohorts without licensing costs. Unlike RaDaR (a reasoning LLM for rare diseases), Talos is not a language model but a genomic pipeline — a complementary approach that operates at the level of DNA variants rather than clinical text.

Frequently Asked Questions

What is genomic reanalysis and why is it important for rare diseases?
Genomic reanalysis is a re-examination of an already-sequenced patient genome using updated knowledge bases and algorithms — particularly valuable because the medical literature on rare diseases grows rapidly, meaning variants that were previously unknown may now have diagnostic value.
How does Talos differ from Exomiser, the previous standard?
Talos outperforms Exomiser in top-1 ranking of correct variants by a statistically significant margin (p<0.0001), and returns far fewer candidate variants per patient (only 1.3 instead of dozens), greatly reducing the review burden for clinicians.