Biology's Multimodal AI Unifier

The MIMIC foundation model unifies biological data modalities in a single generative model, supporting both prediction and constrained molecular design.

The MIMIC foundation model integrates multiple biological data modalities.

Biological systems operate under intricate, coupled constraints spanning sequence, structure, regulation, evolution, and cellular context. Existing foundation models in biology, however, are often siloed, each focusing on a single modality or a fixed forward task. This fragmentation limits their ability to capture how those constraints jointly shape biological function.

Bridging Modalities with MIMIC

Researchers introduce MIMIC, a generative multimodal foundation model designed to overcome these limitations. Trained on LORE, a newly curated and aligned dataset, MIMIC integrates nucleic acid, protein, evolutionary, structural, regulatory, and semantic/contextual data. Its split-track encoder-decoder architecture lets it condition on any subset of observed modalities and reconstruct or generate the missing components of a molecular state across the genome, transcriptome, and proteome.
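To make "conditioning on a subset of modalities" concrete, here is a minimal Python sketch of how one aligned record might be split into observed inputs and reconstruction targets. It is an illustration only, not MIMIC's code: the modality names come from the list above, while the record layout and the function names split_observed_and_targets and sample_observed_subset are hypothetical.

```python
import random
from typing import Dict, Optional, Set, Tuple

# Modality names follow the article's list; the record layout is an assumption.
MODALITIES = (
    "nucleic_acid", "protein", "evolutionary",
    "structural", "regulatory", "semantic",
)


def split_observed_and_targets(
    record: Dict[str, Optional[str]],
    observed: Set[str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split one aligned record into conditioning inputs and targets.

    Modalities absent from the record (missing key or None) are skipped,
    so partially annotated molecules remain usable.
    """
    unknown = observed - set(MODALITIES)
    if unknown:
        raise ValueError(f"unknown modalities: {sorted(unknown)}")
    inputs: Dict[str, str] = {}
    targets: Dict[str, str] = {}
    for name in MODALITIES:
        value = record.get(name)
        if value is None:
            continue
        if name in observed:
            inputs[name] = value   # the model conditions on this
        else:
            targets[name] = value  # the model must reconstruct this
    return inputs, targets


def sample_observed_subset(
    record: Dict[str, Optional[str]],
    rng: random.Random,
) -> Set[str]:
    """Pick a random non-empty subset of the modalities present in the record.

    The subset may include every present modality, in which case nothing is
    left to reconstruct. Raises ValueError if the record has no modalities.
    """
    present = [m for m in MODALITIES if record.get(m) is not None]
    k = rng.randint(1, len(present))
    return set(rng.sample(present, k))


# Toy record with made-up values; the evolutionary track is unavailable.
record = {
    "nucleic_acid": "AUGGCC",
    "protein": "MA",
    "structural": "((..))",
    "evolutionary": None,
}
inputs, targets = split_observed_and_targets(
    record, {"nucleic_acid", "structural"}
)
```

In this toy case the sequence and structure tracks become inputs and the protein track becomes the target. Drawing a different subset for each training example, as sample_observed_subset does, is one plausible way a single model could learn to fill in whichever modalities happen to be missing; the article does not say how MIMIC samples subsets during training.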


From Prediction to Design: A Generative Framework

With multimodal conditioning, MIMIC consistently outperforms sequence-only models at sequence reconstruction. Its learned representations achieve state-of-the-art results on downstream RNA and protein tasks, including splicing prediction. The joint generative formulation also extends beyond prediction to constrained molecular design. For RNA, MIMIC can identify corrective edits for a splice-disrupting mutation by drawing on evolutionary and structural signals. For proteins, it conditions jointly on shape and surface chemistry to generate high-confidence sequences with strong predicted binding affinity. Finally, because MIMIC treats experimental context as semantic conditioning rather than as a fixed output, it can model assay-dependent phenomena such as RNA chemical probing.
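One generic way to picture a "corrective edit" search is to enumerate candidate substitutions near the mutation and rank them with a scoring function. The Python sketch below shows that loop under loud assumptions: the scorer is a placeholder that measures identity to a made-up reference window, standing in for whatever evolutionary and structural signals a real model would supply, and nothing here is claimed to be how MIMIC performs the search. All sequences and function names are hypothetical.

```python
from typing import Callable, List, Tuple

BASES = "ACGU"
# (0-based position, original base, replacement base)
Edit = Tuple[int, str, str]


def candidate_edits(seq: str, positions: range) -> List[Edit]:
    """Enumerate every single-base substitution at the allowed positions."""
    return [
        (i, seq[i], base)
        for i in positions
        for base in BASES
        if base != seq[i]
    ]


def apply_edit(seq: str, edit: Edit) -> str:
    """Return a copy of seq with one base substituted."""
    i, _, new_base = edit
    return seq[:i] + new_base + seq[i + 1:]


def rank_edits(
    seq: str,
    positions: range,
    score: Callable[[str], float],
    top_k: int = 3,
) -> List[Tuple[float, Edit]]:
    """Score each edited sequence and return the top_k, highest score first."""
    scored = [
        (score(apply_edit(seq, edit)), edit)
        for edit in candidate_edits(seq, positions)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def make_identity_scorer(reference: str) -> Callable[[str], float]:
    """Placeholder scorer (assumption): fraction of bases matching a reference.

    A real pipeline would replace this with model-derived scores.
    """
    def score(seq: str) -> float:
        return sum(a == b for a, b in zip(seq, reference)) / len(reference)
    return score


reference = "CAGGUAAGU"  # hypothetical unmutated window
mutant = "CAGGCAAGU"     # hypothetical variant: U>C at index 4
best = rank_edits(mutant, range(2, 7), make_identity_scorer(reference), top_k=1)
```

With this placeholder scorer the top-ranked edit simply reverts the variant, which is the trivial answer. The interesting cases arise when the scorer rewards compensating edits at other positions, which is where evolutionary and structural conditioning would matter.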

© 2026 StartupHub.ai. All rights reserved.