For decades, geneticists have focused on the two percent of the human genome that codes for proteins, often dismissing the remaining 98 percent as "junk DNA." That era is officially ending.
ARC Innovation at Sheba Medical Center and the Icahn School of Medicine at Mount Sinai have announced a landmark three-year collaboration with NVIDIA to build what they are calling a Genomic Large Language Model (Genomic LLM), or Genomic Foundation Model (gFM). The goal is ambitious: to use advanced AI to finally decipher the vast, poorly understood regulatory sequences that govern human health and disease.
This isn't just another research grant; it’s a massive computational undertaking that treats DNA sequences like text. Just as GPT models learn grammar and context from billions of words, this Genomic LLM will learn the biological "language" of the genome from extensive clinical and genomic datasets provided by the medical institutions.
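To make the "DNA as text" idea concrete, here is a minimal sketch of one common way to turn a DNA string into tokens a language model could consume: overlapping k-mers mapped to integer IDs. The announcement does not say how the Genomic LLM will tokenize sequences, so the k-mer scheme, the function names (kmer_tokenize, build_vocab, encode), and the toy sequence are all illustrative assumptions rather than the project's actual method.

```python
from collections import Counter


def kmer_tokenize(sequence: str, k: int = 3, stride: int = 1) -> list[str]:
    """Split a DNA string into overlapping k-mers (hypothetical helper)."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(tokens: list[str]) -> dict[str, int]:
    """Assign integer IDs to tokens, most frequent first, ties broken alphabetically."""
    counts = Counter(tokens)
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return {tok: idx for idx, tok in enumerate(ordered)}


def encode(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Map each token to its integer ID, the form a model would train on."""
    return [vocab[t] for t in tokens]


# Toy sequence for illustration only; not real clinical or genomic data.
dna = "ATGCGATGCA"
tokens = kmer_tokenize(dna, k=3)
vocab = build_vocab(tokens)
ids = encode(tokens, vocab)

print(tokens)  # ['ATG', 'TGC', 'GCG', 'CGA', 'GAT', 'ATG', 'TGC', 'GCA']
print(ids)     # [0, 1, 5, 2, 3, 0, 1, 4]
```

A text LLM does the same thing with word pieces. Once sequences are integer IDs, a model can be trained to predict masked or upcoming tokens from their context, which is where the "grammar" analogy comes from.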
The partnership unites three global heavyweights. Sheba and Mount Sinai bring deep clinical insight, patient data, and genomic expertise, while NVIDIA provides the computational architecture, AI development platforms, and software: the full-stack AI platform required to train a model of this scale.
The stakes are enormous. As Prof. Gidi Rechavi, Head of the Sheba Cancer Research Center, noted, the 98 percent of the genome previously ignored is now recognized as containing critical regulatory and functional elements. Decoding the variability in these regions is the key to next-generation diagnosis and therapy.
The AI Engine for Precision Medicine
The initiative is anchored in Mount Sinai’s Million Health Discoveries Program and Sheba’s ARC Innovation hub, creating a unified engine for accelerating breakthroughs. The initial focus will be on complex diseases where traditional genetic analysis has stalled, requiring researchers to analyze the interplay of thousands of genomic regions simultaneously.
“By bringing advanced AI into genomic research, we’re moving closer to making personalized, precision medicine a reality for all,” said Dr. Alexander Charney, Director of the Charles Bronfman Institute for Personalized Medicine at Mount Sinai.
The fundamental challenge in genomics is pattern recognition at an unprecedented scale. Traditional methods struggle to identify subtle regulatory mechanisms that link genetic variation to disease risk and therapeutic response across millions of data points. A Genomic LLM, however, is designed specifically for this kind of high-dimensional pattern identification.
NVIDIA’s role is critical. Training a foundation model on the human genome requires immense GPU power and specialized software. Dr. Nati Daniel and Dr. Yoli Shavit of Applied AI Architecture at NVIDIA emphasized that developing this state-of-the-art gFM brings together clinicians, geneticists, bioinformaticians, and AI researchers to tackle one of science’s greatest challenges.
If successful, this collaboration will not just produce research papers; it will create a foundational tool, a biological GPT, that can be leveraged globally. This Genomic LLM could fundamentally transform drug discovery, allowing researchers to rapidly identify therapeutic targets and predict patient response with far greater accuracy than current methods allow.
The partnership is a clear signal that the future of precision medicine hinges on treating biology as an information science problem solvable by large-scale AI. As Prof. Eyal Zimlichman, Director of ARC at Sheba, stated, "Only by combining the unique strengths of the three partnering organizations, can we solve one of the toughest challenges in healthcare that touches at the very core of how the human body works." This project aims to accelerate scientific discovery, strengthen health systems, and drive new economic value throughout the healthcare ecosystem worldwide.



