Ben Kunkle on Building Zed's Zeta2 Prediction Model

Ben Kunkle from Zed Industries explains the architecture and data pipeline for building Zeta2, an AI model that predicts code edits.

Ben Kunkle discusses the creation of Zeta2 at an AI Engineer event.· AI Engineer

Ben Kunkle, Lead at Zed Industries, detailed the process of building Zeta2, an AI model designed to predict a user's next edit as they type. In his presentation, Kunkle explained the technical pipeline and data considerations involved in training such a model, emphasizing the challenges and solutions encountered in production environments.

Visual TL;DR. Predicting Code Edits leads to Zeta2 Model. Ultra-Low Latency requires Zeta2 Model. Training Pipeline trains Zeta2 Model. Data Considerations informs Training Pipeline. Teacher Frontier Model uses Training Pipeline. Offline Evaluation leads to Production Monitoring. Zeta2 Model enables Faster Coding.

  1. Predicting Code Edits: AI model predicts user's next code edit as they type
  2. Zeta2 Model: Specialized, small AI model for fast keystroke prediction
  3. Ultra-Low Latency: Must operate under 300ms per keystroke for real-time use
  4. Training Pipeline: Ingests production and synthetic data for model training
  5. Data Considerations: Focus on 'settled data' and production vs. synthetic sources
  6. Teacher Frontier Model: Generates training data for the Zeta2 prediction model
  7. Offline Evaluation: Assessing model performance before production deployment
  8. Production Monitoring: Continuous tracking of model performance in live environment
  9. Faster Coding: Enables quicker and more efficient code writing for users

Understanding Edit Prediction

Kunkle began by defining edit prediction: the model receives the context around the user's cursor, the user's recent edits, relevant type or variable definitions, and any diagnostics or errors, and must predict the next edit. Prediction runs on every keystroke with a latency budget under 300 milliseconds, which necessitates a small, specialized model.
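The inputs and the latency budget described above can be sketched in a few lines of Python. This is an illustration only, not Zed's implementation: the `EditPredictionInput` fields, the `predict_within_budget` helper, and the idea of discarding late results are assumptions made for the example. Only the 300 ms figure and the list of context sources come from the talk.

```python
import time
from dataclasses import dataclass, field

LATENCY_BUDGET_MS = 300  # per-keystroke budget quoted in the talk


@dataclass
class EditPredictionInput:
    """Hypothetical container for the context the talk describes."""
    cursor_excerpt: str                                     # text around the cursor
    cursor_offset: int                                      # cursor position in the excerpt
    recent_edits: list[str] = field(default_factory=list)   # e.g. recent diffs
    definitions: list[str] = field(default_factory=list)    # type/variable definitions
    diagnostics: list[str] = field(default_factory=list)    # errors and warnings


def predict_within_budget(model, request, budget_ms=LATENCY_BUDGET_MS):
    """Call `model(request)`; return (prediction, elapsed_ms).

    The prediction is replaced by None when the call took longer than
    the budget, since a late suggestion is stale by the next keystroke.
    `model` is any callable; it stands in for the real predictor.
    """
    start = time.perf_counter()
    prediction = model(request)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    if elapsed_ms > budget_ms:
        return None, elapsed_ms
    return prediction, elapsed_ms
```

A real editor would more likely cancel an in-flight request than wait for it and throw the result away; the sketch only makes the budget check explicit.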

The Training Pipeline

The core of the training process is a pipeline that ingests both 'production data' (snapshots of user activity) and 'synthetic data' (git commits). This data is fed to a 'teacher frontier' model, which generates predictions. The predictions are evaluated, and any that fail go to a 'repair' stage, where a teacher model attempts to correct them. The corrected data then feeds the distillation process that trains the student model. Kunkle highlighted that every stage reads JSONL, enriches the records into 'examples', and writes JSONL back out, a uniform format that he described as crucial for managing large datasets efficiently across experiments.
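A minimal sketch of such a JSONL-in, JSONL-out pipeline follows. The stage names mirror the article, but the record fields (`prediction`, `passed`, `repaired`) and the `teacher`, `passes`, and `repair` callables are hypothetical placeholders, not Zed's actual schema or models.

```python
import json


def read_jsonl(stream):
    """Yield one dict per non-empty line of a JSONL stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def write_jsonl(examples, stream):
    """Write each example as one JSON object per line."""
    for example in examples:
        stream.write(json.dumps(example) + "\n")


def teacher_stage(examples, teacher):
    """Enrich each example with the teacher model's prediction."""
    for ex in examples:
        yield {**ex, "prediction": teacher(ex)}


def evaluate_stage(examples, passes):
    """Mark each example with whether its prediction passed evaluation."""
    for ex in examples:
        yield {**ex, "passed": passes(ex)}


def repair_stage(examples, repair):
    """Pass good examples through; ask `repair` to fix the failures."""
    for ex in examples:
        if ex["passed"]:
            yield ex
        else:
            yield {**ex, "prediction": repair(ex), "repaired": True}


def run_pipeline(in_stream, out_stream, teacher, passes, repair):
    """Chain the stages lazily so large files are never fully in memory."""
    examples = read_jsonl(in_stream)
    examples = teacher_stage(examples, teacher)
    examples = evaluate_stage(examples, passes)
    examples = repair_stage(examples, repair)
    write_jsonl(examples, out_stream)
```

Because every stage consumes and produces the same record shape, any stage's output can be saved to disk and reused as the input of a later experiment, which is one plausible reading of why the uniform format matters.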

Data Considerations and 'Settled Data'

A significant challenge in training edit prediction models is that the data is inherently noisy. Kunkle explained that the team addresses this with a concept called 'settled data': they wait for the prediction region to stabilize and then capture the final state of the code as the 'answer'. Comparing the model's predictions against this 'settled state' lets them filter out noisy examples and identify high-quality training data, so that training uses examples where the match between the prediction and the final code is clear and unambiguous.
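One way to picture the 'settled data' idea is the sketch below. The article does not say how stabilization is detected, so the quiet-period rule, the `snapshots` format of `(timestamp, text)` pairs, and the whitespace normalization are all assumptions chosen for illustration.

```python
def settled_text(snapshots, quiet_seconds=5.0):
    """Return the region's final text if it has stopped changing.

    `snapshots` is a time-ordered list of (timestamp, text) pairs for
    the prediction region. The region counts as settled when its final
    text has been unchanged for at least `quiet_seconds` (an assumed
    threshold). Returns None when the region is still in flux.
    """
    if not snapshots:
        return None
    last_time, last_text = snapshots[-1]
    stable_since = last_time
    # Walk backwards to find when the region first took its final form.
    for timestamp, text in reversed(snapshots):
        if text != last_text:
            break
        stable_since = timestamp
    if last_time - stable_since >= quiet_seconds:
        return last_text
    return None


def normalize(text):
    """Ignore trailing whitespace and surrounding blank lines."""
    return "\n".join(line.rstrip() for line in text.strip().splitlines())


def keep_clean_examples(examples, quiet_seconds=5.0):
    """Keep examples whose prediction matches the settled state."""
    kept = []
    for ex in examples:
        answer = settled_text(ex["snapshots"], quiet_seconds)
        if answer is None:
            continue  # never settled: too noisy to label
        if normalize(ex["prediction"]) == normalize(answer):
            kept.append({**ex, "answer": answer})
    return kept
```

An exact match is the strictest possible filter; a production system could just as well score similarity and keep examples above a threshold.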

Offline Evaluation and Production Monitoring

For offline evaluation, Kunkle mentioned metrics such as 'deltaChrF' (based on chrF, a character n-gram F-score), exact lines matched, reversal ratio, and kept rate. These metrics assess the model's performance on a held-out test set. He also stressed the importance of tracking performance in production after deployment: structured logs record latency, kept rate, and token counts, and dashboards monitor acceptance rates and A/B test results across model versions. The goal is to continuously monitor and improve the model's effectiveness in real-world usage.
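Some of these metrics can be sketched briefly. The article names the metrics without defining them, so everything below is an assumption: `chrf` is a plain, simplified character n-gram F-score (not the 'delta' variant from the talk), `exact_lines_matched` counts lines shared by prediction and reference regardless of position, and `kept_rate` assumes each logged event carries a boolean `kept` field. Reversal ratio is omitted because its definition is not given.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count all character n-grams of length n in `text`."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF: F-beta over averaged n-gram precision and recall."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one side is shorter than n characters
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)


def exact_lines_matched(hypothesis, reference):
    """Number of lines that appear in both texts (multiset overlap)."""
    hyp = Counter(hypothesis.splitlines())
    ref = Counter(reference.splitlines())
    return sum((hyp & ref).values())


def kept_rate(events):
    """Share of shown predictions whose `kept` flag is true."""
    if not events:
        return 0.0
    return sum(1 for e in events if e["kept"]) / len(events)
```

The same functions could run offline against a held-out test set and, given suitable logs, over production events, which keeps the two views of model quality comparable.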

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.