The Bitter Lesson: AI's Protein Problem

Alex Rives of Biohub explains how AI language models are learning the 'grammar' of protein biology, enabling the design of new proteins and therapeutics.

Screenshots from the video showing code and protein structures, and the speakers.
Latent Space

Alex Rives, Head of Science at Biohub, discussed how AI is reshaping protein biology, drawing parallels to the "bitter lesson" observed in other AI domains. In a conversation with Brandon Anderson, Staff Scientist at Atomic AI, Rives described how large language models trained on vast datasets of protein sequences can learn fundamental biological principles. This capability supports a shift toward programmable biology, in which AI predicts protein structures and functions and designs novel proteins with desired therapeutic properties.

Visual TL;DR. AI's protein problem applies the bitter lesson, which relies on large language models. These models enable protein structure prediction and the design of novel proteins. Protein design in turn leads to programmable biology and to proteins with therapeutic properties.

  1. AI's Protein Problem: AI models learning the 'grammar' of protein biology
  2. The Bitter Lesson: learning from data without explicit programming
  3. Large Language Models: trained on vast datasets of protein sequences
  4. Predict Protein Structures: uncovering implicit rules governing protein folding and function
  5. Design Novel Proteins: enabling the creation of new proteins with desired properties
  6. Programmable Biology: a new era for designing biological systems
  7. Therapeutic Properties: designing proteins for medical applications

The Bitter Lesson and Protein Biology

Rives explained that applying AI to protein biology rests on learning from data rather than explicit programming. Just as AI models have learned to excel at tasks like language translation or image recognition by processing massive amounts of data, they can uncover the implicit rules governing protein folding, function, and interactions. This approach, he noted, reflects the "bitter lesson" (a term from Rich Sutton's 2019 essay): scaling computation and data often yields more general and powerful AI capabilities than handcrafted features or domain-specific heuristics.

From Language Models to Protein Models

The conversation traced the move from natural language processing to biological sequences. Rives described how models like ESM (Evolutionary Scale Modeling) are trained to predict hidden tokens (masked amino acids) in a sequence from their surrounding context, a process that, when applied to proteins, lets them learn the underlying grammar of protein biology. He showed how these models produce representations that capture complex biological information, such as evolutionary constraints and functional motifs. This learned representation, he argued, is akin to a "world model" of protein biology, enabling deeper understanding and manipulation of these molecules.
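To make the training objective concrete, here is a minimal sketch of token prediction on a protein sequence. It uses the masked variant, which is the objective of the published ESM-2 models; a next-token variant would hide only the final position. Everything here is illustrative: the function names, the 15% default mask rate, and the neighbour-count "model" are assumptions for the example, not the ESM code, and a real model would be a large neural network rather than a lookup table.

```python
import random
from collections import Counter, defaultdict

MASK = "<mask>"  # placeholder token, an assumption for this sketch


def mask_sequence(seq, mask_rate=0.15, rng=None):
    """Hide a random subset of residues.

    Returns (masked token list, {position: true residue}).
    """
    rng = rng or random.Random(0)
    tokens = list(seq)
    n_mask = max(1, round(len(tokens) * mask_rate))
    positions = rng.sample(range(len(tokens)), n_mask)
    targets = {i: tokens[i] for i in positions}
    for i in positions:
        tokens[i] = MASK
    return tokens, targets


def fit_context_counts(sequences):
    """Toy stand-in for a model: count which residue sits between each
    (left neighbour, right neighbour) pair in the training sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        padded = "^" + seq + "$"  # sentinel start/end symbols
        for i in range(1, len(padded) - 1):
            counts[(padded[i - 1], padded[i + 1])][padded[i]] += 1
    return counts


def predict_masked(tokens, position, counts):
    """Guess the residue at a masked position from its two neighbours.

    Returns None when the neighbour pair was never seen in training.
    """
    left = tokens[position - 1] if position > 0 else "^"
    right = tokens[position + 1] if position < len(tokens) - 1 else "$"
    seen = counts.get((left, right))
    return seen.most_common(1)[0][0] if seen else None


if __name__ == "__main__":
    sequence = "MKTAYIAKQR"  # arbitrary example sequence
    masked, answers = mask_sequence(sequence, mask_rate=0.2)
    model = fit_context_counts([sequence])
    for pos, truth in sorted(answers.items()):
        print(pos, truth, predict_masked(masked, pos, model))
```

The point of the sketch is only that the training signal comes from the sequences themselves: no structural or functional labels are supplied, yet predicting a hidden residue well requires knowing which residues tend to co-occur.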

Designing Proteins with AI

A key takeaway from Rives's discussion was that AI can move beyond prediction into generative design. Using what these models have learned, researchers can search the learned protein "worlds" for sequences that satisfy specific design criteria, such as binding to a particular target molecule or exhibiting a desired structural property. As a concrete example, he highlighted the design of novel mini-protein binders that target proteins such as EGFR and CTLA-4.
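The idea of searching sequence space against a design criterion can be sketched as a simple mutate-and-score loop. This is a toy hill climb, not the method discussed in the conversation: `toy_score` is a hypothetical stand-in that rewards matches to an arbitrary motif, whereas a real pipeline would score candidates with a trained model (for example a predicted binding or structure objective).

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def toy_score(seq, motif="HEH"):
    """Hypothetical objective: best number of matching positions between
    the motif and any window of the sequence. Stands in for a model score."""
    best = 0
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        best = max(best, sum(a == b for a, b in zip(window, motif)))
    return best


def hill_climb(start, score_fn, steps=2000, seed=0):
    """Propose single-residue mutations and keep those that do not
    lower the score. Returns (best sequence, its score)."""
    rng = random.Random(seed)
    best, best_score = start, score_fn(start)
    for _ in range(steps):
        pos = rng.randrange(len(best))
        candidate = best[:pos] + rng.choice(AMINO_ACIDS) + best[pos + 1:]
        score = score_fn(candidate)
        if score >= best_score:  # accept ties so the search can cross plateaus
            best, best_score = candidate, score
    return best, best_score


if __name__ == "__main__":
    designed, score = hill_climb("A" * 12, toy_score)
    print(designed, score)
```

Swapping `toy_score` for a learned scoring function changes what the loop optimizes without changing the loop itself, which is why the quality of the underlying model matters so much for design.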

The Importance of Scale and Data

Rives emphasized that scale and data availability were critical to these breakthroughs. Large-scale protein language models, trained on massive datasets of protein sequences and structures, have been instrumental in producing these emergent capabilities. He pointed to the exponential growth in the size and complexity of these models over the years, noting that as models scale, they not only perform better but also begin to capture more nuanced biological principles. This scaling trend, he suggested, is likely to continue and to yield further discoveries.

Implications for the Future of Biology

The implications of this work extend far beyond basic research. Rives envisioned a future where AI-powered protein design could accelerate drug discovery, enable the creation of novel enzymes for industrial applications, and even contribute to sustainable biotechnology solutions. The ability to precisely engineer proteins with desired functions, guided by AI, opens up vast possibilities for tackling some of the world's most pressing challenges in health, environment, and energy.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.