GitHub Cuts Agentic Workflow Costs

GitHub implements new strategies to cut token costs in its automated agentic workflows by enhancing logging and optimizing tool usage.

[Figure: Diagram illustrating token flow in GitHub Agentic Workflows, with optimization points highlighted. Source: GitHub Blog]

Automated workflows can silently inflate API bills. GitHub is tackling this head-on by optimizing token efficiency in its own GitHub Agentic Workflows. These automated systems, designed to maintain code quality and perform CI tasks, run frequently and can incur significant costs without direct oversight.

Unlike interactive AI sessions, the predictable nature of YAML-defined workflows allows for systematic optimization. GitHub's engineering and security teams recognized the need to manage token usage, mirroring concerns of their user base.

Logging Token Consumption

The first step involved understanding where tokens were being spent. A challenge emerged from the inconsistent logging formats across different agent frameworks. To solve this, GitHub leveraged its API proxy, which sits between agents and authentication credentials, to capture usage data in a standardized format.

Every workflow now generates a token-usage.jsonl artifact. This log details input tokens, output tokens, cache reads/writes, model, provider, and timestamps for each API call. This data provides a historical view essential for identifying inefficiencies.
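A minimal sketch of what consuming such a log might look like. The exact field names in token-usage.jsonl are not published in this article, so the record shape below (input_tokens, output_tokens, cache_read, cache_write, model, provider, ts) is an assumption for illustration:

```python
import json
from collections import defaultdict

# Hypothetical token-usage.jsonl records; real field names may differ.
sample = [
    '{"ts": "2026-01-15T08:00:00Z", "model": "gpt-4o", "provider": "openai", '
    '"input_tokens": 1200, "output_tokens": 300, "cache_read": 800, "cache_write": 0}',
    '{"ts": "2026-01-15T08:00:05Z", "model": "gpt-4o", "provider": "openai", '
    '"input_tokens": 2500, "output_tokens": 450, "cache_read": 0, "cache_write": 900}',
]

def summarize(lines):
    """Aggregate per-model input/output token totals from JSONL records."""
    totals = defaultdict(lambda: {"input": 0, "output": 0})
    for line in lines:
        rec = json.loads(line)
        totals[rec["model"]]["input"] += rec["input_tokens"]
        totals[rec["model"]]["output"] += rec["output_tokens"]
    return dict(totals)

print(summarize(sample))  # {'gpt-4o': {'input': 3700, 'output': 750}}
```

Because every workflow emits the same standardized format, one small aggregator like this can roll up usage across all agent frameworks.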


Automated Optimization Workflows

Two daily workflows were developed to analyze and address token usage: the Daily Token Usage Auditor and the Daily Token Optimizer.

The Auditor aggregates consumption data, flags workflows with escalating usage, and identifies anomalous runs. When an issue is detected, the Optimizer analyzes the workflow's source code and logs to create a GitHub issue detailing specific inefficiencies and proposing solutions.
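The escalation check the Auditor performs could work along these lines. The threshold, history shape, and workflow names below are illustrative assumptions, not GitHub's actual heuristics:

```python
from statistics import mean

# Hypothetical daily token totals per workflow (most recent run last).
history = {
    "smoke-test": [4100, 3900, 4000, 4200, 4050],
    "pr-triage":  [9000, 9500, 9200, 9400, 21000],  # sudden spike
}

def flag_escalations(history, factor=1.5):
    """Flag workflows whose latest run exceeds `factor` x the prior average."""
    flagged = []
    for name, runs in history.items():
        baseline = mean(runs[:-1])
        if runs[-1] > factor * baseline:
            flagged.append(name)
    return flagged

print(flag_escalations(history))  # ['pr-triage']
```

A flagged workflow would then be handed to the Optimizer, which inspects its source and logs before opening an issue.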

This creates a virtuous cycle, as the Auditor and Optimizer themselves are agentic workflows whose token usage is also monitored.

Eliminating Unused MCP Tools

A primary inefficiency identified was the inclusion of unused tool registrations within agent configurations. Because LLM APIs are stateless, agent runtimes often send the full list of available tool function names and JSON schemas with each request.

For a server with 40 tools, this can add substantial overhead to every API call, even if only a few tools are actually used. The Optimizer identifies workflows that consistently use a narrow set of tools and recommends pruning the rest.
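The pruning recommendation reduces to a set difference between tools registered in the agent configuration and tools the logs show were actually invoked. A sketch under that assumption (tool names are placeholders):

```python
# Hypothetical data: tools registered in a workflow's MCP config vs. the
# tools its logged runs actually invoked.
registered = {f"tool_{i}" for i in range(40)}
used = {"tool_1", "tool_7", "tool_12"}

def prune_recommendation(registered, used):
    """Return tools safe to drop from the agent's tool registration."""
    return sorted(registered - used)

unused = prune_recommendation(registered, used)
print(f"{len(unused)} of {len(registered)} tools unused; recommend removing them")
```

Every pruned tool removes its name and JSON schema from each stateless API request, which is where the per-call savings come from.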

Removing unused tools in smoke-test workflows reduced per-call context size by 8-12 KB, saving thousands of tokens per run without impacting behavior. This addresses a key aspect of GitHub Agentic Workflows token efficiency.

Replacing GitHub MCP with GitHub CLI

A more significant optimization involved replacing GitHub MCP (Model Context Protocol) calls for data retrieval with direct calls to the GitHub CLI. MCP calls involve an LLM reasoning step to decide on and execute a tool, consuming tokens for the tool schema, arguments, and responses.

In contrast, GitHub CLI commands such as gh pr diff are deterministic: they issue HTTP requests to the GitHub REST API with no LLM involvement at all. This shift moves much of the data fetching out of the LLM reasoning loop.

Two strategies were employed: pre-agentic data downloads using gh commands before the agent starts, and an in-agent CLI proxy substitution for runtime-determined fetches. This reduces token usage while maintaining security.
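The in-agent substitution can be pictured as a lookup table that translates data-retrieval tool calls into equivalent gh invocations. This mapping and the tool names are illustrative assumptions, not GitHub's actual proxy code; only the command construction is shown, not execution:

```python
# Sketch of substituting deterministic GitHub CLI calls for MCP tool calls.
# Tool names and the mapping itself are hypothetical.
GH_SUBSTITUTIONS = {
    "get_pull_request_diff": lambda args: ["gh", "pr", "diff", str(args["number"])],
    "get_issue": lambda args: ["gh", "issue", "view", str(args["number"]),
                               "--json", "title,body"],
}

def to_cli_command(tool_name, args):
    """Translate a data-retrieval tool call into a gh CLI invocation,
    bypassing the LLM reasoning loop entirely."""
    if tool_name not in GH_SUBSTITUTIONS:
        raise ValueError(f"no CLI substitution for {tool_name}")
    return GH_SUBSTITUTIONS[tool_name](args)

print(to_cli_command("get_pull_request_diff", {"number": 42}))
# ['gh', 'pr', 'diff', '42']
```

Calls without a deterministic substitute would still fall through to the normal MCP path, so agent behavior is preserved for genuinely dynamic tool use.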

Measuring efficiency gains proves challenging.

© 2026 StartupHub.ai. All rights reserved.