AI Cracks Rare Genetic Disease Codes

OpenAI's reasoning model helped identify diagnoses in 18 previously unsolved rare genetic disease cases, demonstrating AI's potential in re-analyzing complex medical data.

Image: abstract graphic representing data analysis and genetic sequencing. AI is being explored to help physicians diagnose rare genetic diseases. (Credit: OpenAI News)

Even with advanced genomic sequencing, a significant portion of children with rare genetic diseases remain undiagnosed. Sifting through vast amounts of genetic data and an evolving scientific literature is a formidable challenge for physicians. Now, a new study published in NEJM AI details how an OpenAI reasoning model is showing promise in re-analyzing these complex cases. Researchers from Boston Children’s Hospital, Harvard University, and OpenAI used the o3 Deep Research model to re-examine 376 previously unsolved cases.

Visual TL;DR

  1. Undiagnosed Rare Diseases: many children with rare genetic diseases remain undiagnosed
  2. Data Overload Challenge: sifting through vast genetic data and literature is difficult
  3. OpenAI Reasoning Model: o3 Deep Research model used to re-examine cases
  4. Re-analyzing Complex Cases: systematically revisit difficult cases with AI
  5. Connecting Fragmented Data: linking patient data, records, variants, and papers
  6. Identified 18 Cases: diagnoses identified in 18 previously unsolved rare genetic disease cases
  7. AI Potential Shown: demonstrating AI's potential in re-analyzing complex medical data

Unlocking Old Mysteries

The core challenge lies in connecting fragmented patient data, clinical records, and genetic variants with a scientific literature that is constantly being updated. As new gene-disease relationships are discovered, older, inconclusive cases can suddenly become interpretable. This study aimed to use AI to systematically revisit these difficult cases, acting as an 'explanation-first reasoning layer' on top of existing genomic pipelines.
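To picture why an old case can become interpretable, consider a deliberately simplified sketch. It is not the study's method, which relied on a reasoning model rather than a lookup. The function name, case IDs, and gene labels are all hypothetical placeholders.

```python
from typing import Dict, List, Set


def cases_to_revisit(unsolved: Dict[str, Set[str]], newly_linked: Set[str]) -> List[str]:
    """Return IDs of unsolved cases that carry a variant in a newly linked gene.

    `unsolved` maps a case ID to the genes in which that case has candidate
    variants; `newly_linked` holds genes recently associated with a disease.
    """
    return sorted(
        case_id for case_id, genes in unsolved.items() if genes & newly_linked
    )


# Illustrative placeholder data only (no real patients or genes).
unsolved_cases = {
    "case-001": {"GENE_A", "GENE_B"},
    "case-002": {"GENE_C"},
    "case-003": {"GENE_B", "GENE_D"},
}
new_gene_disease_links = {"GENE_B"}

print(cases_to_revisit(unsolved_cases, new_gene_disease_links))
# ['case-001', 'case-003']
```

A lookup like this only flags candidates. The approach described in the article goes further by asking a model to explain how a variant could account for the patient's presentation.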

The workflow involved feeding the model de-identified patient data, including standardized phenotype terms, clinical notes, and variant tables. The AI was tasked with proposing the most plausible molecular explanation, crucially showing its work. This allowed human experts to review the generated hypotheses using established clinical frameworks.
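The inputs described above could be bundled roughly as follows. This is a minimal sketch under stated assumptions: the class names, field names, prompt wording, and sample values are hypothetical, and the study's actual schema and prompts are not given in this article. The phenotype identifiers use HPO-style codes as an assumed example of "standardized phenotype terms". No model is called here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Variant:
    gene: str       # gene symbol (placeholder values below)
    hgvs: str       # variant description (placeholder values below)
    zygosity: str   # e.g. "heterozygous"


@dataclass
class CaseRecord:
    case_id: str                 # de-identified case label
    phenotype_terms: List[str]   # standardized phenotype terms (HPO-style, assumed)
    clinical_note: str           # free-text summary with identifiers removed
    variants: List[Variant] = field(default_factory=list)


def build_prompt(case: CaseRecord) -> str:
    """Assemble one case into a text prompt that asks for a reasoned explanation."""
    variant_rows = "\n".join(
        f"- {v.gene}\t{v.hgvs}\t{v.zygosity}" for v in case.variants
    )
    return (
        f"Case {case.case_id} (de-identified)\n"
        f"Phenotype terms: {', '.join(case.phenotype_terms)}\n"
        f"Clinical note: {case.clinical_note}\n"
        f"Variant table:\n{variant_rows}\n\n"
        "Propose the most plausible molecular explanation. "
        "Show your reasoning and cite the evidence for each claim "
        "so that a clinician can review it."
    )


# Hypothetical example case; none of these values come from the study.
example = CaseRecord(
    case_id="case-001",
    phenotype_terms=["HP:0001250 (Seizure)", "HP:0001263 (Global developmental delay)"],
    clinical_note="Onset in infancy; prior sequencing analysis was non-diagnostic.",
    variants=[Variant(gene="GENE_A", hgvs="c.100A>G", zygosity="heterozygous")],
)

print(build_prompt(example))
```

Requesting the reasoning alongside the answer is what lets reviewers check each hypothesis against the evidence rather than take a gene name on trust.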

Tangible Results in Complex Cases

After expert review and clinical confirmation, diagnoses were established in 18 of the 376 cases, an additional diagnostic yield of 4.8%. This success rate, while seemingly modest, is significant given that these cases had already evaded extensive specialist analysis. This outcome highlights the potential of expert-led, AI-assisted reanalysis to scale as medical knowledge expands. The AI did not make diagnoses; it generated evidence-linked hypotheses for clinicians to investigate.
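The 4.8% figure follows directly from the reported counts. The short sketch below reproduces it and also computes a Wilson score interval for the proportion. The interval is an illustrative addition that the article does not report, and the function names are hypothetical.

```python
import math
from typing import Tuple


def diagnostic_yield(confirmed: int, total: int) -> float:
    """Fraction of re-analyzed cases that received a confirmed diagnosis."""
    return confirmed / total


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Counts reported in the article: 18 confirmed diagnoses out of 376 cases.
yield_fraction = diagnostic_yield(18, 376)
print(f"{yield_fraction:.1%}")  # 4.8%

# Not reported in the article; shown only to convey the uncertainty in a
# proportion estimated from 376 cases.
low, high = wilson_interval(18, 376)
print(f"{low:.1%} to {high:.1%}")
```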

The AI demonstrated flexibility, even inferring a structural genomic event not initially present in the input data for one early-psychosis case, leading to a confirmed diagnosis of DiGeorge syndrome. In another instance, variants in two genes (LAMA2 and FOXP1) together better explained a patient's complex presentation, showcasing the model's ability to handle multi-gene explanations.

Hypothesizing Novel Mechanisms

Beyond direct diagnoses, the model also generated testable hypotheses for novel disease mechanisms. For a neurodevelopmental case with vitiligo, the AI highlighted a specific gene deletion (S1PR1) and proposed a link to altered signaling pathways affecting pigment production and immune cell presence in the skin. This demonstrates AI's capacity to translate scattered findings into concrete, experimentally verifiable theories, potentially accelerating our understanding of rare genetic diseases.

The study also pointed to potential phenotype expansion for known genes, suggesting broader clinical spectra that warrant further investigation.

Limitations and Future Directions

Researchers emphasized that this study is retrospective and that the model's outputs require rigorous human adjudication and clinical confirmation. They also noted that broader clinical deployment will require stringent privacy, security, and regulatory compliance. While this research used OpenAI's o3 Deep Research model, newer, more specialized AI systems are emerging for deeper life-sciences work, such as genomic sequencing analysis.

Future work will involve prospective, multi-center studies to rigorously compare AI-assisted reanalysis with standard practice on diagnostic yield, clinician effort, and cost. The ultimate goal is to improve outcomes for patients with undiagnosed rare genetic conditions, building on prior advances in AI for rare disease identification.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.