Sarah Chieng: Fast Models Need "Slow" Developers

Uploaded: 2026-05-22T19:02:36.471Z
Description: Cerebras' Sarah Chieng discusses how fast AI coding models like Codex Spark necessitate new developer habits and workflows for optimal results.

May 22 at 7:02 PM, 8 min read

Sarah Chieng on stage presenting "Fast Models Need Slow Developers" — AI Engineer

Visual TL;DR. AI Coding Speed Revolution leads to Codex Spark Speed. Codex Spark Speed exposes Existing Developer Habits. Existing Developer Habits necessitates Rethink Developer Workflows. Optimized Inference Stack enables Codex Spark Speed. Sarah Chieng's Insight discusses AI Coding Speed Revolution. Rethink Developer Workflows leads to Optimal AI Results. New Developer Habits enables Optimal AI Results.

AI Coding Speed Revolution: models like Codex Spark generate code incredibly fast
Codex Spark Speed: 1,200 tokens per second, a massive leap
Existing Developer Habits: current workflows are not optimized for this speed
Rethink Developer Workflows: need new habits for optimal AI assistant interaction
Optimized Inference Stack: hardware and model architecture advancements enable speed
Sarah Chieng's Insight: Head of Developer Experience at Cerebras
New Developer Habits: adapting to faster AI code generation
Optimal AI Results: achieving the full potential of fast models

Exceptionally fast AI coding models like Codex Spark present both opportunities and challenges for developers. In her AI Engineer Europe talk, "Fast Models Need Slow Developers," Sarah Chieng, Head of Developer Experience at Cerebras, argued that this speed calls for a fundamental shift in how developers work with AI coding assistants.

The Speed Revolution in AI Coding

Chieng introduced Codex Spark, a model that generates code at 1,200 tokens per second, roughly 20 to 30 times the 40-60 tokens per second seen in models like the Sonnet family or GPT-4. This increase, she explained, comes from optimizations across the entire AI inference stack, including hardware advancements and novel model architectures.
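To make the gap concrete, here is a back-of-the-envelope sketch in Python. The token rates come from the talk; the 6,000-token response size and the function name `generation_seconds` are assumptions chosen purely for illustration, and the calculation ignores time to first token and network overhead.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady decode rate, ignoring startup latency."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Hypothetical response size, chosen only to make the comparison concrete.
RESPONSE_TOKENS = 6_000

# Rates quoted in the talk: 40-60 tokens/s for slower models, 1,200 for Codex Spark.
for rate in (40, 60, 1_200):
    seconds = generation_seconds(RESPONSE_TOKENS, rate)
    print(f"{rate:>5} tokens/s -> {seconds:6.1f} s")
```

Under these assumptions the same response takes 150 seconds at 40 tokens per second, 100 seconds at 60, and 5 seconds at 1,200, which is why waiting on the model stops being the bottleneck.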

However, this speed also exposes the limitations of existing developer habits. "A lot of these bad habits that we had before that were generating maybe 50 tokens per second of bad code," Chieng said, "unless we fix them, they're going to start generating 1,200 tokens per second of bad code." The core message is that faster models alone are not enough; developers must also adapt their approach to use that speed effectively.

Rethinking Developer Workflows for Speed

Chieng outlined several key strategies for developers to navigate this new era:

Leverage Different Models for Different Tasks: For complex planning or long-horizon workflows, using a more capable, albeit slower, model like GPT-4.5 is recommended. For the actual execution and rapid iteration, faster models like Codex Spark are ideal. This allows developers to "spawn all of your sub-agents with Codex Spark and have it actually execute on all of the steps."
Create Reusable Skills: When a developer achieves a good result with an AI agent for a specific task, they should capture that trajectory and turn it into a reusable skill. This transforms successful, albeit time-consuming, interactions into repeatable workflows that can be run in the background.
Optimize Context Management: With faster models, managing the context window becomes crucial. Chieng advised breaking down large tasks into smaller, bounded goals and maintaining a persistent external memory system (like agents.md, plan.md, progress.md, verify.md) to keep track of progress and inform future steps. Developers should aim to avoid filling the context window completely, as this can lead to information loss and degraded model performance.
Refactor and Validate Freely: The speed of models like Codex Spark makes validation and refactoring almost instantaneous. Developers can afford to ask the AI to perform these tasks more frequently, ensuring cleaner code and a better understanding of the project's progression.
Explore Cherry Picking: For tasks where quantity or variety is valued, such as generating different UI styles or exploring research directions, developers can prompt the AI to produce multiple variations and then cherry-pick the best results, significantly accelerating the creative and iterative process.
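Several of these strategies can be sketched in a few lines of Python. This is an illustrative sketch, not Chieng's implementation or any tool's API: the model labels, the function names, and the 70% context threshold are all assumptions. Only the `progress.md` file name comes from the talk's list of external memory files.

```python
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical model labels; substitute whatever your tooling actually exposes.
PLANNER_MODEL = "slow-capable-model"
EXECUTOR_MODEL = "fast-model"


def choose_model(task_kind: str) -> str:
    """Send planning to the slower model and everything else to the fast one."""
    return PLANNER_MODEL if task_kind == "plan" else EXECUTOR_MODEL


def within_context_budget(used_tokens: int, window_tokens: int,
                          max_fraction: float = 0.7) -> bool:
    """True while usage stays at or below max_fraction of the window.

    The 0.7 default is an arbitrary illustration, not a figure from the talk.
    """
    return used_tokens <= window_tokens * max_fraction


def record_progress(memory_dir: Path, goal: str, outcome: str) -> None:
    """Append one finished goal to progress.md so a fresh session can resume."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    with (memory_dir / "progress.md").open("a", encoding="utf-8") as handle:
        handle.write(f"- {goal}: {outcome}\n")


def cherry_pick(candidates: list[str], score: Callable[[str], float]) -> str:
    """Return the highest-scoring candidate among several generated variations."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score)


# Small demonstration using a throwaway directory as the external memory.
with tempfile.TemporaryDirectory() as tmp:
    memory = Path(tmp)
    print(choose_model("plan"), choose_model("execute"))
    record_progress(memory, "add login form", "done, tests pass")
    print((memory / "progress.md").read_text(encoding="utf-8").strip())
    print(within_context_budget(used_tokens=100_000, window_tokens=200_000))
```

Keeping progress in a file rather than in the conversation means a bounded goal can start in a fresh session with a nearly empty context window, which is the point of the external memory files Chieng describes. In practice the `score` passed to `cherry_pick` would often be a human choosing by eye rather than a function.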

Chieng emphasized that the true benefit of these advancements lies not just in raw speed, but in the improved developer experience. "What it really means is that the developer experience is actually going to become so much better," she asserted. "And when it's becoming so much better, there's so much more we can do."

The talk concluded with a list of useful commands for interacting with Codex Spark, including `/permissions`, `/experimental`, `/skills`, `/review`, `/rename`, `/new`, `/resume`, and `/fork`, underscoring the importance of precise control and interaction with these powerful new tools.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.
