Code reviews are essential for catching bugs and sharing knowledge, but they often become a significant bottleneck for engineering teams. Cloudflare experienced this firsthand, with merge requests waiting hours for initial feedback. To tackle this, they explored AI code review solutions.
Initial attempts with existing AI tools showed promise but lacked the customization needed for an organization the size of Cloudflare. A more direct approach, feeding raw diffs into large language models with basic prompts, resulted in a flood of vague, often inaccurate suggestions. This led Cloudflare to develop a CI-native orchestration system around OpenCode, an open-source coding agent.
Orchestrating AI Code Review at Scale
Cloudflare's current system deploys a coordinated group of specialized AI agents for each merge request. Instead of a single, monolithic model, up to seven distinct reviewers focus on areas like security, performance, code quality, documentation, release management, and compliance with their internal Engineering Codex. A central coordinator agent manages these specialists, deduplicating findings, assessing severity, and consolidating feedback into a single, structured comment.
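Cloudflare has not published implementation details, but the fan-out/fan-in pattern the article describes is easy to sketch. The TypeScript below is a minimal illustration with all names and data shapes assumed rather than taken from Cloudflare's code: specialists review the same diff in parallel, and a coordinator deduplicates overlapping findings, keeps the highest severity, and renders one consolidated comment.

```typescript
// Hypothetical sketch of the fan-out/fan-in review flow. Interface and
// field names are illustrative, not Cloudflare's actual API.
interface Finding {
  file: string;
  line: number;
  severity: "info" | "warning" | "critical";
  message: string;
}

interface SpecialistAgent {
  name: string; // e.g. "security", "performance", "codex-compliance"
  review(diff: string): Promise<Finding[]>;
}

function rank(s: Finding["severity"]): number {
  return { info: 0, warning: 1, critical: 2 }[s];
}

// The coordinator runs every specialist against the same diff, then
// deduplicates findings, keeping the highest severity per location.
async function coordinate(
  diff: string,
  agents: SpecialistAgent[],
): Promise<Finding[]> {
  const results = await Promise.all(agents.map((a) => a.review(diff)));
  const merged = new Map<string, Finding>();
  for (const finding of results.flat()) {
    const key = `${finding.file}:${finding.line}:${finding.message}`;
    const existing = merged.get(key);
    if (!existing || rank(finding.severity) > rank(existing.severity)) {
      merged.set(key, finding);
    }
  }
  return [...merged.values()];
}

// Consolidate everything into a single structured comment.
function renderComment(findings: Finding[]): string {
  if (findings.length === 0) return "No issues found.";
  return findings
    .map((f) => `- [${f.severity}] ${f.file}:${f.line} — ${f.message}`)
    .join("\n");
}
```

The key design choice implied by the article is that specialists never talk to the merge request directly; only the coordinator posts, which is what keeps feedback to a single comment rather than seven competing threads.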
The system has been validated across tens of thousands of merge requests: approving clean code, flagging genuine bugs with high accuracy, and blocking merges that introduce critical issues or security vulnerabilities. The initiative is part of Cloudflare's broader strategy for improving engineering resiliency, known as Code Orange: Fail Small.
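In a CI pipeline, "blocking" typically means failing the job when the consolidated review contains anything critical, so the platform's existing merge protections do the enforcement. A sketch of such a gate, again with assumed types and an assumed exit-code convention:

```typescript
type Severity = "info" | "warning" | "critical";

// Hypothetical CI gate over the coordinator's consolidated findings:
// any critical finding fails the pipeline stage, which blocks the merge.
function gate(findings: { severity: Severity; message: string }[]): void {
  const critical = findings.filter((f) => f.severity === "critical");
  if (critical.length > 0) {
    for (const f of critical) console.error(`critical: ${f.message}`);
    console.error(`Blocking merge: ${critical.length} critical finding(s).`);
    process.exit(1); // non-zero exit fails the CI job (Node.js convention)
  }
  console.log("No blocking findings; merge may proceed.");
}
```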
Modular Architecture for Flexibility
Building internal tooling that spans thousands of repositories requires extreme flexibility. Cloudflare opted for a composable plugin architecture to avoid hardcoding dependencies on specific version control systems or AI providers. This design ensures the system can adapt to future changes, such as supporting new VCS platforms or integrating different AI models.
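The plugin interfaces themselves are not public, but the shape of such an architecture is straightforward: the orchestration core programs against abstractions, and each VCS or model integration is a separate implementation behind them. In the illustrative sketch below, `VcsProvider`, `ModelProvider`, and `runReview` are hypothetical names, not Cloudflare's.

```typescript
// Illustrative plugin contracts. Supporting a new VCS platform or AI
// provider means writing a new implementation, not changing core logic.
interface VcsProvider {
  fetchDiff(mergeRequestId: string): Promise<string>;
  postComment(mergeRequestId: string, body: string): Promise<void>;
}

interface ModelProvider {
  complete(systemPrompt: string, userPrompt: string): Promise<string>;
}

// The review pipeline is written once against the abstractions, so the
// same code runs unchanged whichever plugins are wired in.
async function runReview(
  mrId: string,
  vcs: VcsProvider,
  model: ModelProvider,
): Promise<void> {
  const diff = await vcs.fetchDiff(mrId);
  const review = await model.complete(
    "You are a security-focused code reviewer.",
    diff,
  );
  await vcs.postComment(mrId, review);
}
```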
