Hugging Face's Ben Burtenshaw on AI System Engineering

Ben Burtenshaw from Hugging Face discusses how AI coding agents can be used for AI system engineering, kernel optimization, and building multi-agent autoresearch labs.

Ben Burtenshaw from Hugging Face recently presented on the potential of AI agents in system engineering. He argued that coding agents have become capable enough to move beyond simple code generation and should now be applied to system-level engineering tasks.

Hugging Face's Ben Burtenshaw on AI System Engineering — from AI Engineer

Visual TL;DR. AI Agents for Engineering enable System Engineering Tasks. System Engineering Tasks involve Custom Kernels. Custom Kernels lead to Performance Optimization. System Engineering Tasks require Agent Benchmarking. AI Agents for Engineering build Multi-Agent Labs. Multi-Agent Labs create Autoresearch Labs. System Engineering Tasks advance AI System Engineering. Autoresearch Labs enhance AI System Engineering.

  1. AI Agents for Engineering: coding agents evolving beyond simple code generation
  2. System Engineering Tasks: tackling intricate engineering challenges, discovering APIs, connecting systems
  3. Custom Kernels: optimizing performance with specialized code for specific hardware
  4. Performance Optimization: achieving faster execution through tailored kernel development
  5. Agent Benchmarking: measuring and comparing AI agent capabilities and performance
  6. Multi-Agent Labs: building autoresearch labs with interconnected AI agents
  7. Autoresearch Labs: enabling AI agents to conduct research and development autonomously
  8. AI System Engineering: leveraging AI agents for complex system design and implementation

The Role of AI Agents in System Engineering

Burtenshaw emphasized that AI agents are no longer just tools for writing snippets of code; they are evolving into collaborators that can tackle intricate engineering challenges. He pointed to the increasing acceptance of coding agents, citing Andrej Karpathy and DHH, who have been using them for years. That acceptance keeps growing as agents show they can discover APIs, connect systems, and even manage home automation devices.

Custom Kernels and Performance Optimization

A significant portion of Burtenshaw's presentation focused on creating and optimizing custom compute kernels, particularly for AI workloads. He described a kernel as a function compiled to run on a GPU and called from Python, and stressed the importance of making these functions efficient. Burtenshaw showed how custom kernels such as the popular Flash Attention can significantly increase arithmetic density, reduce the time spent communicating tensors, and ultimately keep GPUs running close to peak performance.
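
To make the arithmetic density point concrete, here is a minimal sketch of why fusing operations into one kernel helps. It is a back-of-the-envelope model, not anything shown in the talk: the operation `relu(a * x + b)`, the scalar `a` and `b`, and the helper names are all assumptions chosen for illustration. Arithmetic density here means floating-point operations per byte moved, a quantity more commonly called arithmetic intensity.

```python
# Toy memory-traffic model for y = relu(a * x + b) over n float32 elements.
# Assumption: a and b are scalars, so only full-size tensors count as traffic.

BYTES_PER_FLOAT32 = 4


def bytes_moved(n_elements: int, tensor_reads: int, tensor_writes: int) -> int:
    """Bytes transferred between GPU memory and the compute units."""
    return n_elements * (tensor_reads + tensor_writes) * BYTES_PER_FLOAT32


def arithmetic_intensity(flops: int, traffic_bytes: int) -> float:
    """Floating-point operations performed per byte moved."""
    return flops / traffic_bytes


n = 1_000_000
flops = 3 * n  # one multiply, one add, one max per element

# Unfused: three separate kernels (mul, add, relu), each reading one
# full tensor from memory and writing one full tensor back.
unfused = arithmetic_intensity(flops, bytes_moved(n, tensor_reads=3, tensor_writes=3))

# Fused: a single kernel reads x once and writes y once; the
# intermediate values never leave registers.
fused = arithmetic_intensity(flops, bytes_moved(n, tensor_reads=1, tensor_writes=1))

print(f"unfused: {unfused:.3f} FLOP/byte")
print(f"fused:   {fused:.3f} FLOP/byte")
print(f"ratio:   {fused / unfused:.1f}x")
```

The work is identical in both cases; only the traffic changes. Kernels like Flash Attention apply the same idea to a much larger computation by avoiding round trips for intermediate results.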

He also introduced Hugging Face's 'kernels' library, which is designed to make compute kernels easier to build and share. The library aims to enforce a unified and predictable structure, ensure reproducibility, offer native PyTorch compatibility, and foster community sharing. Burtenshaw demonstrated how developers can publish their own kernels to the Hub, making them accessible to others.
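
As a rough sketch of the consuming side, loading a published kernel from the Hub might look like the following. The repository name `kernels-community/activation` and the `gelu_fast` function are taken from the library's public examples and are not from the talk; the wrapper function `gelu_from_hub` is hypothetical. Depending on the installed release, `get_kernel` may also expect a version or revision argument, so treat this as an illustration and check the current documentation.

```python
REPO_ID = "kernels-community/activation"  # example repo; an assumption, not from the talk


def gelu_from_hub(repo_id: str = REPO_ID):
    """Load a kernel from the Hub and apply it to a random tensor.

    Needs `pip install kernels torch` and a CUDA GPU, so the imports are
    deferred until the function is actually called.
    """
    import torch
    from kernels import get_kernel

    # Fetches a prebuilt kernel from the Hub for the local environment.
    activation = get_kernel(repo_id)

    x = torch.randn((10, 10), dtype=torch.float16, device="cuda")
    y = torch.empty_like(x)
    activation.gelu_fast(y, x)  # writes the result into y
    return y
```

Calling `gelu_from_hub()` on a machine with a supported GPU returns a tensor with the same shape as the input.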

Benchmarking and Agent Performance

To illustrate the effectiveness of agents in this domain, Burtenshaw presented benchmarking results. Agents generated CUDA kernels, which were then benchmarked and optimized. In one example, agent-generated and agent-optimized kernels achieved an average speedup of 1.94x for a Qwen3-8B model on an H100 GPU, a concrete measure of the performance gains available through agent-assisted engineering.
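
A speedup figure of this kind comes from timing a baseline against a candidate and dividing. The sketch below shows one way such a harness could be structured; it is not the benchmark used in the talk. The two functions being timed are plain Python stand-ins, and the use of a geometric mean to combine speedups is an assumption, since the article does not say how the average was computed. On a GPU, the timed callable would also need to synchronize the device before the clock stops, because kernel launches are asynchronous.

```python
import math
import statistics
import time
from typing import Callable, Sequence


def median_runtime(fn: Callable[[], object], warmup: int = 3, repeats: int = 20) -> float:
    """Median wall-clock seconds per call, after a few warmup calls."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


def mean_speedup(speedups: Sequence[float]) -> float:
    """Geometric mean of per-workload speedups (an assumed averaging choice)."""
    return math.exp(sum(math.log(s) for s in speedups) / len(speedups))


# Hypothetical stand-ins for a reference op and an agent-generated replacement.
data = list(range(10_000))


def baseline() -> int:
    total = 0
    for value in data:
        total += value * value
    return total


def candidate() -> int:
    return sum(value * value for value in data)


# Check correctness before timing: a fast kernel with wrong output is useless.
assert baseline() == candidate()

measured = speedup(median_runtime(baseline), median_runtime(candidate))
print(f"measured speedup: {measured:.2f}x")
```

The correctness check matters as much as the timing: an agent optimizing for speed alone can produce a kernel that is fast because it skips work.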

The Power of Multi-Agent Autoresearch Labs

Burtenshaw also delved into the concept of multi-agent autoresearch labs, outlining a system composed of specialized agents working collaboratively. This system includes:

  • Researcher: Scouts Hugging Face papers for ideas and defines research directions.
  • Planner: Acts as a central coordinator, owning the experiment queue and proposing hypotheses.
  • Worker Agents: Execute experiments, fetch code, and test hypotheses.
  • Reporter: Monitors the progress of jobs, synchronizes status, and provides an overview of active jobs and anomalies.
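
The division of labor among these four roles can be sketched in a few lines. Everything below is hypothetical: the class names mirror the roles in the list, but the methods, the learning-rate sweep, and the toy scoring function are assumptions made for illustration. A real system would put an LLM-backed agent and actual training jobs behind each role.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Experiment:
    hypothesis: str
    learning_rate: float
    status: str = "queued"
    score: Optional[float] = None


class Researcher:
    """Proposes directions; a real one would scout papers for ideas."""

    def propose(self) -> list[float]:
        return [1e-4, 1e-3, 1e-2]


class Planner:
    """Owns the experiment queue and turns ideas into hypotheses."""

    def __init__(self) -> None:
        self.queue: deque[Experiment] = deque()

    def plan(self, learning_rates: list[float]) -> None:
        for lr in learning_rates:
            self.queue.append(Experiment(f"lr={lr} gives the best score", lr))

    def next_experiment(self) -> Optional[Experiment]:
        return self.queue.popleft() if self.queue else None


class Worker:
    """Runs one experiment; the score is a toy stand-in for a training job."""

    def run(self, exp: Experiment) -> Experiment:
        exp.status = "running"
        exp.score = 1.0 - abs(exp.learning_rate - 1e-3)  # peaks at lr = 1e-3
        exp.status = "done"
        return exp


class Reporter:
    """Summarizes job status across all experiments."""

    def summarize(self, experiments: list[Experiment]) -> dict[str, int]:
        counts: dict[str, int] = {}
        for exp in experiments:
            counts[exp.status] = counts.get(exp.status, 0) + 1
        return counts


researcher, planner, worker, reporter = Researcher(), Planner(), Worker(), Reporter()
planner.plan(researcher.propose())

finished = []
while (exp := planner.next_experiment()) is not None:
    finished.append(worker.run(exp))

best = max(finished, key=lambda e: e.score)
print(reporter.summarize(finished), "best lr:", best.learning_rate)
```

Keeping the queue in a single place means workers never need to coordinate with each other; they only take the next experiment and report back.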

This multi-agent approach allows hyperparameters and model architectures to be explored systematically and automatically, which shortens research cycles. Tools like Trackio are used to monitor and visualize the experiments, giving insight into how the research is progressing.
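
For the monitoring side, a worker could log each run through a tracker with `init`, `log`, and `finish` calls, which is the interface Trackio exposes. The sketch below takes the tracker as a parameter so it runs without any installation; `log_sweep`, `RecordingTracker`, the project name, and the loss values are all made up for illustration and are not results from the talk.

```python
def log_sweep(tracker, project: str, results: dict[float, list[float]]) -> int:
    """Log one run per learning rate and return the number of runs logged.

    `tracker` is any object exposing init/log/finish, for example the
    `trackio` module itself.
    """
    runs = 0
    for lr, losses in results.items():
        tracker.init(project=project, name=f"lr-{lr}", config={"learning_rate": lr})
        for step, loss in enumerate(losses):
            tracker.log({"step": step, "loss": loss})
        tracker.finish()
        runs += 1
    return runs


class RecordingTracker:
    """In-memory stand-in with the same three calls, so the sketch runs offline."""

    def __init__(self) -> None:
        self.events: list[tuple[str, dict]] = []

    def init(self, project: str, name: str, config: dict) -> None:
        self.events.append(("init", {"project": project, "name": name, "config": config}))

    def log(self, metrics: dict) -> None:
        self.events.append(("log", metrics))

    def finish(self) -> None:
        self.events.append(("finish", {}))


# Illustrative loss curves only; these numbers are invented for the example.
illustrative_losses = {1e-4: [2.0, 1.8, 1.7], 1e-3: [2.0, 1.2, 0.8]}

tracker = RecordingTracker()
runs_logged = log_sweep(tracker, "autoresearch-demo", illustrative_losses)
print(f"logged {runs_logged} runs, {len(tracker.events)} events")
```

With Trackio installed, the same function could be called as `log_sweep(trackio, "autoresearch-demo", illustrative_losses)` after `import trackio`.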

Key Takeaways

Burtenshaw concluded with several key takeaways:

  • Agents work best with primitives and exposed, well-defined interfaces, rather than overly abstract ones.
  • The Hugging Face Hub is a robust platform ready to support AI workloads with core infrastructure for storage, compute, and versioning.
  • Multi-agent systems can be effectively structured with specialized roles to automate and accelerate AI research.

The presentation underscored the growing capabilities of AI agents in system engineering, highlighting their potential to drive efficiency and innovation in the field.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.