Benjamin Cowen on Fine-Tuning AI Models with Modal

Benjamin Cowen from Modal discusses the shift towards custom, fine-tuned AI models and how serverless platforms simplify this process.

Benjamin Cowen presenting on 'What Lies Beneath the API' at an AI Engineer Europe event.

Benjamin Cowen, a Forward Deployed Machine Learning Engineer at Modal, recently gave a talk titled "What Lies Beneath the API," exploring the evolving landscape of AI model development and deployment. Cowen discussed the growing trend of companies fine-tuning their own models rather than relying solely on general-purpose APIs, and how serverless platforms are making this more accessible.

Visual TL;DR. Frontier APIs and Scratch Servers both lead to the Model Spectrum. The Model Spectrum shows Domain-Specific Models and highlights the Need for Fine-Tuning. Domain-Specific Models lead to the Need for Fine-Tuning, which is informed by Key Fine-Tuning Signals. The Need for Fine-Tuning requires Serverless Infrastructure, which enables Accessible Custom AI.

  1. Frontier APIs: quick start, no infrastructure, powerful pre-trained models
  2. Scratch Servers: full control, precise fine-tuning for specific needs
  3. Model Spectrum: progression from general APIs to custom solutions
  4. Domain-Specific Models: growing trend for tailored AI performance
  5. Need for Fine-Tuning: customization unlocks better, predictable AI performance
  6. Serverless Infrastructure: simplifies AI training and inference processes
  7. Accessible Custom AI: making fine-tuning easier for companies
  8. Key Fine-Tuning Signals: identifying when custom models are beneficial

The Model Spectrum: From Frontier API to Custom Solutions

Cowen introduced the concept of the "Model Spectrum," illustrating a progression from using readily available "Frontier APIs" to building and managing models on "Scratch Servers." Frontier APIs offer a quick start with no infrastructure overhead and access to powerful, pre-trained models. However, they lack customization and can sometimes yield unpredictable performance.

On the other end of the spectrum, Scratch Servers provide full control and the ability to fine-tune models precisely to specific needs. This approach offers maximum customization and allows for the definition of custom metrics. The trade-off is the significant burden of infrastructure management, including cluster management and self-maintenance of software stacks.

The Rise of Domain-Specific Models and the Need for Fine-Tuning

Cowen highlighted that as companies mature, they increasingly need to fine-tune models on proprietary data to achieve better performance, lower latency, and custom functionality. He cited examples such as Intercom's Fin Apex, which reportedly beat GPT-5.4 at one-fifth of the cost, and Pinterest CEO Ben Silbermann's statement about achieving "orders of magnitude reduction in cost" by fine-tuning open-source models rather than using frontier APIs.
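
To make the cost claim concrete, here is a minimal arithmetic sketch of what "one-fifth of the cost" could mean for a monthly bill, and how long a one-time fine-tuning investment would take to pay back. Every number in it (token volume, per-token prices, the one-time cost) is a hypothetical placeholder rather than a figure from the talk, and the function names are invented for illustration.

```python
def monthly_cost(tokens_per_month: int, usd_per_million_tokens: float) -> float:
    """Monthly spend for a given token volume and per-million-token price."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens


def break_even_months(one_time_cost: float, api_monthly: float, tuned_monthly: float) -> float:
    """Months until the monthly savings repay a one-time fine-tuning cost."""
    savings = api_monthly - tuned_monthly
    if savings <= 0:
        return float("inf")  # the fine-tuned model never pays for itself
    return one_time_cost / savings


# Hypothetical inputs, chosen only to illustrate the arithmetic.
TOKENS_PER_MONTH = 2_000_000_000
API_PRICE = 10.0              # USD per million tokens (assumed)
TUNED_PRICE = API_PRICE / 5   # "one-fifth of the cost"
ONE_TIME_COST = 48_000.0      # training runs, evals, engineering time (assumed)

api_monthly = monthly_cost(TOKENS_PER_MONTH, API_PRICE)
tuned_monthly = monthly_cost(TOKENS_PER_MONTH, TUNED_PRICE)
months = break_even_months(ONE_TIME_COST, api_monthly, tuned_monthly)

print(f"API: ${api_monthly:,.0f}/mo, fine-tuned: ${tuned_monthly:,.0f}/mo, "
      f"break-even after {months:.1f} months")
```

With these placeholder inputs, the monthly bill drops from $20,000 to $4,000 and the assumed one-time cost is recovered in three months. The same calculation with real volumes and prices is one way to test whether unit economics justify the move.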

This trend signifies a shift in how AI is viewed: models are becoming raw materials, and the fine-tuned, domain-specific system is the actual product. Cowen emphasized that this fine-tuning process is becoming more accessible.

Serverless Infrastructure for AI Training and Inference

The presentation showcased how serverless platforms like Modal are bridging the gap between ease of use and control. Cowen explained that Modal's infrastructure, which includes unified GPUs and sandboxed environments, makes large-scale AI training and inference feasible with significantly less code and management overhead.

He demonstrated that fine-tuning, whether of large language models (LLMs) or for reinforcement learning (RL) tasks, can be achieved with surprisingly concise codebases, often in as few as 300 lines of Python. This is made possible by open-source libraries combined with serverless infrastructure that handles parallel hyperparameter sweeps and scaling.
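
The shape of a parallel hyperparameter sweep can be sketched with the Python standard library alone. This is not Modal's API and not code from the talk: local threads stand in for the containers a serverless platform would launch, `train_and_eval` is a hypothetical placeholder that returns a synthetic loss instead of running a real fine-tuning job, and the grid values are made up.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical search grid; real values depend on the model and dataset.
GRID = {"learning_rate": [1e-5, 1e-4, 1e-3], "batch_size": [16, 32, 64]}


def train_and_eval(config: dict) -> dict:
    """Placeholder for one fine-tuning run; returns a synthetic validation loss."""
    lr, bs = config["learning_rate"], config["batch_size"]
    # Made-up bowl-shaped loss with its minimum at lr=1e-4, batch_size=32.
    loss = (math.log10(lr) + 4) ** 2 + ((bs - 32) / 32) ** 2
    return {**config, "val_loss": loss}


def sweep(grid: dict) -> list[dict]:
    """Run every grid combination concurrently and return results, best first."""
    keys = list(grid)
    configs = [dict(zip(keys, values)) for values in product(*grid.values())]
    # One worker per config: each thread stands in for one remote container.
    with ThreadPoolExecutor(max_workers=len(configs)) as pool:
        results = list(pool.map(train_and_eval, configs))
    return sorted(results, key=lambda r: r["val_loss"])


results = sweep(GRID)
best = results[0]
print(f"{len(results)} runs, best config: {best}")
```

The design point is that the sweep is just a map over independent configurations. On a serverless platform, the platform's own fan-out primitive would take the place of the thread pool, with each configuration running in its own GPU container.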

Cowen provided code examples illustrating how to set up fine-tuning jobs and deploy models efficiently. He noted that the ability to scale containers on demand and the abstraction of infrastructure management are key benefits of using such platforms. This allows developers to focus on model development and data curation rather than infrastructure plumbing.

Key Signals for Fine-Tuning

Cowen also outlined several signals that indicate it might be time for a product to transition to a fine-tuned, domain-specific model:

  • Evaluations are plateauing despite prompt work.
  • There is a need for lower latency or higher throughput.
  • Unit economics are not scaling effectively.
  • Core functionality is still developing.
  • There's a lack of collected, relevant data for prompt engineering.
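
The first signal, evaluations plateauing despite prompt work, can be checked mechanically. The sketch below is one possible heuristic, not a method described in the talk: the function name, the window size, and the minimum-gain threshold are all assumptions, and the scores are invented.

```python
def has_plateaued(scores: list[float], window: int = 3, min_gain: float = 0.01) -> bool:
    """Return True when the last `window` eval scores beat the earlier best
    by less than `min_gain`, i.e. recent prompt work has stopped paying off."""
    if len(scores) <= window:
        return False  # not enough history to compare against
    earlier_best = max(scores[:-window])
    recent_best = max(scores[-window:])
    return recent_best - earlier_best < min_gain


# Hypothetical eval scores, one per prompt iteration.
stalled = [0.60, 0.68, 0.72, 0.725, 0.723, 0.727]
improving = [0.60, 0.65, 0.70, 0.75, 0.80, 0.85]

print(has_plateaued(stalled))
print(has_plateaued(improving))
```

For the first series the check reports a plateau, because the last three scores improve on the earlier best by less than one point. For the second it does not, since prompt iteration is still producing large gains.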

He concluded by emphasizing that if a team has already built agent harnesses and evaluation suites, brought in AI engineers, and collected data, the hard part of building a domain-specific model may already be done, making the transition to fine-tuning a logical next step.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.