Ben Kunkle on Building Zed's Zeta2 Prediction Model

Ben Kunkle from Zed Industries explains the architecture and data pipeline for building Zeta2, an AI model that predicts code edits.

May 30 at 5:01 PM · 7 min read

Ben Kunkle discusses the creation of Zeta2 at an AI Engineer event. Image: AI Engineer

Visual TL;DR. Predicting Code Edits leads to Zeta2 Model. Ultra-Low Latency requires Zeta2 Model. Training Pipeline trains Zeta2 Model. Data Considerations informs Training Pipeline. Teacher Frontier Model uses Training Pipeline. Offline Evaluation leads to Production Monitoring. Zeta2 Model enables Faster Coding.

Predicting Code Edits: AI model predicts user's next code edit as they type
Zeta2 Model: Specialized, small AI model for fast keystroke prediction
Ultra-Low Latency: Must operate under 300ms per keystroke for real-time use
Training Pipeline: Ingests production and synthetic data for model training
Data Considerations: Focus on 'settled data' and production vs. synthetic sources
Teacher Frontier Model: Generates training data for the Zeta2 prediction model
Offline Evaluation: Assessing model performance before production deployment
Production Monitoring: Continuous tracking of model performance in live environment
Faster Coding: Enables quicker and more efficient code writing for users

Ben Kunkle, Lead at Zed Industries, detailed the process of building Zeta2, an AI model designed to predict a user's next edit as they type. In his presentation, Kunkle explained the technical pipeline and data considerations involved in training such a model, emphasizing the challenges and solutions encountered in production environments.

Understanding Edit Prediction

Kunkle began by defining edit prediction: the model receives the code around the user's cursor, the user's recent edits, relevant type or variable definitions, and any diagnostics or errors, and predicts the next edit. Because a prediction runs on every keystroke, the latency budget is under 300 milliseconds, which calls for a small, specialized model.
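To make those inputs and the latency budget concrete, here is a minimal Python sketch. The EditPredictionRequest class, its field names, build_prompt, and predict_with_budget are hypothetical names chosen for illustration, not Zed's actual types or API; only the four kinds of input and the 300 ms figure come from the talk.

```python
import time
from dataclasses import dataclass, field

LATENCY_BUDGET_MS = 300  # figure from the talk: under 300 ms per keystroke


@dataclass
class EditPredictionRequest:
    """Hypothetical container for the inputs listed in the talk."""
    cursor_context: str                                   # code around the cursor
    recent_edits: list = field(default_factory=list)      # recent edits, e.g. small diffs
    definitions: list = field(default_factory=list)       # type or variable definitions
    diagnostics: list = field(default_factory=list)       # errors and warnings


def build_prompt(req: EditPredictionRequest) -> str:
    """Flatten the request into one prompt string (the layout is an assumption)."""
    sections = [
        ("definitions", "\n".join(req.definitions)),
        ("diagnostics", "\n".join(req.diagnostics)),
        ("recent_edits", "\n".join(req.recent_edits)),
        ("cursor_context", req.cursor_context),
    ]
    return "\n".join(f"[{name}]\n{body}" for name, body in sections)


def predict_with_budget(req, model, budget_ms=LATENCY_BUDGET_MS):
    """Call the model and discard the result if it arrives after the budget."""
    start = time.perf_counter()
    prediction = model(build_prompt(req))
    elapsed_ms = (time.perf_counter() - start) * 1000
    if elapsed_ms > budget_ms:
        return None, elapsed_ms  # too slow to be useful for this keystroke
    return prediction, elapsed_ms
```

Here model is any callable that maps a prompt string to a predicted edit, so a stub such as lambda prompt: "1;" is enough to try the function. Dropping late predictions is one possible policy; the talk does not say how Zed handles a blown budget.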

The Training Pipeline

The core of the training process is a pipeline that ingests both 'production data' (snapshots of user activity) and 'synthetic data' (git commits). This data is fed to a frontier 'teacher' model, which generates predictions. The predictions are evaluated, and any that fail go to a 'repair' stage, where a teacher model attempts to correct them. The corrected data is then fed back into the distillation process that trains the smaller student model. Kunkle highlighted that every stage follows the same pattern: it reads JSONL, enriches each record into an 'example', and writes JSONL again, which keeps large datasets manageable across experiments.
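The JSONL-in, JSONL-out pattern can be sketched in a few lines of Python. Everything named here (the record keys "input", "prediction", "passed", and "repaired", and the stage functions) is an assumption made for illustration; the talk describes the shape of the pipeline, not its schema or code.

```python
import json


def read_jsonl(lines):
    """Parse one JSON record per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def write_jsonl(examples):
    """Serialize each example back to one JSON line."""
    return [json.dumps(example, sort_keys=True) for example in examples]


def teacher_stage(examples, teacher):
    """Enrich each example with the teacher model's prediction."""
    for ex in examples:
        yield {**ex, "prediction": teacher(ex["input"])}


def evaluate_stage(examples, passes):
    """Mark each example as passing or failing evaluation."""
    for ex in examples:
        yield {**ex, "passed": passes(ex)}


def repair_stage(examples, repair):
    """Send failed examples to the repair function; pass the rest through."""
    for ex in examples:
        if ex["passed"]:
            yield ex
        else:
            yield {**ex, "prediction": repair(ex), "repaired": True}


def run_pipeline(lines, teacher, passes, repair):
    """Chain the stages; input and output are both lists of JSONL lines."""
    examples = read_jsonl(lines)
    examples = teacher_stage(examples, teacher)
    examples = evaluate_stage(examples, passes)
    examples = repair_stage(examples, repair)
    return write_jsonl(examples)
```

Because each stage only adds keys to a record, the output of any stage can be written to disk and reused as the input of a later experiment, which is the property the talk emphasizes. In this sketch teacher, passes, and repair are plain callables standing in for model calls.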

Data Considerations and 'Settled Data'

A significant challenge in training edit prediction models is that the data is inherently noisy. To address this, Kunkle explained, the team uses a concept called 'settled data': they wait for the prediction region to stabilize and then capture the final state of the code as the 'answer'. Comparing the model's predictions against this 'settled state' lets them filter out noisy examples and keep high-quality training data, so the model trains on examples where the match between the prediction and the final code is clear and unambiguous.
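One way to picture the settled-data filter is the Python sketch below. The talk does not say how stabilization is detected or how a match is scored, so the quiet period (quiet_seconds), the Snapshot type, and the exact-match comparison are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    timestamp: float   # seconds since the prediction was shown
    region_text: str   # contents of the prediction region at that time


def settled_text(snapshots, quiet_seconds=5.0):
    """Return the region text once it has stayed unchanged for quiet_seconds.

    Returns None if the region never stays stable that long.
    """
    current = None
    stable_since = None
    for snap in sorted(snapshots, key=lambda s: s.timestamp):
        if snap.region_text != current:
            # The region changed, so the stability clock restarts.
            current = snap.region_text
            stable_since = snap.timestamp
        elif snap.timestamp - stable_since >= quiet_seconds:
            return current
    return None


def keep_for_training(prediction, snapshots, quiet_seconds=5.0):
    """Keep an example only if the prediction matches the settled state."""
    settled = settled_text(snapshots, quiet_seconds)
    return settled is not None and prediction.strip() == settled.strip()
```

For example, a region observed as "a" at t=0 and "ab" at t=1, 3, and 7 settles to "ab" with a 5-second quiet period, while a region that is still changing at the last snapshot yields no answer and the example is dropped.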

Offline Evaluation and Production Monitoring

For offline evaluation, Kunkle mentioned metrics such as 'deltaChrF' (based on chrF, a character n-gram F-score), exact lines matched, reversal ratio, and kept rate. These metrics assess the model's performance on a held-out test set. He also stressed tracking model performance in production after deployment: structured logs record latency, kept rate, and token counts, and dashboards monitor acceptance rates and A/B test results across model versions. The goal is to continuously monitor and improve the model's effectiveness in real-world usage.
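As a rough illustration of two of these offline metrics, the Python sketch below computes a simplified character n-gram F-score and a count of exactly matched lines. It is not the reference chrF implementation and not Zed's deltaChrF, whose exact definition the talk summary does not give; max_n, beta, and the line-matching rule are assumptions.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count the character n-grams of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def simple_chrf(prediction, reference, max_n=3, beta=2.0):
    """Simplified chrF: average n-gram precision and recall, then F-beta."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        pred = char_ngrams(prediction, n)
        ref = char_ngrams(reference, n)
        if not pred or not ref:
            continue  # one side is shorter than n characters
        overlap = sum((pred & ref).values())
        precisions.append(overlap / sum(pred.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p == 0.0 and r == 0.0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


def exact_lines_matched(prediction, reference):
    """Count lines that appear in both texts (multiset overlap)."""
    pred_lines = Counter(prediction.splitlines())
    ref_lines = Counter(reference.splitlines())
    return sum((pred_lines & ref_lines).values())
```

With beta set to 2, recall counts more than precision, which is the usual chrF default. An identical prediction and reference score 1.0, and texts with no characters in common score 0.0.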

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.
