Hybrid Agents Master GUI-Tool Orchestration

ToolCUA agent overcomes hybrid action space uncertainty with a novel staged training pipeline, achieving state-of-the-art performance in GUI-Tool orchestration.

[Figure: The ToolCUA agent's staged training pipeline and hybrid action space.]

Computer Use Agents (CUAs) that operate in a hybrid action space can choose between granular GUI interactions and high-level tool calls. The uncertainty over which modality to use at each step hinders optimal execution. This challenge is compounded by the scarcity of high-quality interleaved GUI-Tool trajectories and the difficulty of collecting real-world tool usage data.

Visual TL;DR. Hybrid Action Space Uncertainty hinders the ToolCUA Agent. Scarcity of Trajectories is addressed by the ToolCUA Agent. The ToolCUA Agent uses a Staged Training Pipeline. The Staged Training Pipeline includes a Trajectory Scaling Pipeline. The Trajectory Scaling Pipeline enables Smarter Switching Decisions. The Staged Training Pipeline leads to State-of-the-Art Performance.

  1. Hybrid Action Space Uncertainty: difficulty in combining granular GUI and high-level tool calls
  2. Scarcity of Trajectories: lack of quality interleaved GUI-Tool trajectories and real-world tool data
  3. ToolCUA Agent: novel agent designed to overcome hybrid action space challenges
  4. Staged Training Pipeline: multi-phase approach for robust learning in complex action spaces
  5. Trajectory Scaling Pipeline: synthesizes diverse GUI-Tool trajectories from static GUI data
  6. Smarter Switching Decisions: bootstraps improved decision-making for tool and GUI interactions
  7. State-of-the-Art Performance: achieves superior results in GUI-Tool orchestration tasks

Synthesizing Hybrid Trajectories at Scale

Addressing this gap, the researchers introduce ToolCUA, an end-to-end agent trained in stages. Its core innovation is the Interleaved GUI-Tool Trajectory Scaling Pipeline, which repurposes abundant static GUI trajectories and synthesizes a grounded tool library. The result is a diverse set of GUI-Tool trajectories, produced without costly manual engineering or brittle real-world tool data collection, that gives the agent enough interleaved examples to learn in the hybrid action space.
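One way to picture the idea of turning static GUI trajectories into interleaved ones is the toy sketch below. It is not the researchers' pipeline: the `GuiAction` and `ToolCall` types, the `TOOL_LIBRARY` contents, and the `interleave` function are all hypothetical names invented for illustration, and the matching rule (exact replacement of a fixed GUI action sequence by one tool call) is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class GuiAction:
    """A single low-level GUI step (hypothetical schema)."""
    kind: str    # e.g. "click" or "type"
    target: str  # e.g. a widget label


@dataclass(frozen=True)
class ToolCall:
    """A high-level tool call standing in for several GUI steps."""
    name: str
    replaces: int  # how many GUI steps this call replaced


Step = Union[GuiAction, ToolCall]

# Hypothetical tool library: each tool is "grounded" in the GUI steps it replaces.
TOOL_LIBRARY: Dict[str, List[GuiAction]] = {
    "save_file": [GuiAction("click", "File"), GuiAction("click", "Save")],
}


def interleave(trajectory: List[GuiAction],
               library: Dict[str, List[GuiAction]]) -> List[Step]:
    """Rewrite a GUI-only trajectory into a mixed GUI-Tool trajectory."""
    out: List[Step] = []
    i = 0
    while i < len(trajectory):
        for name, pattern in library.items():
            n = len(pattern)
            # Skip empty patterns so the scan always advances.
            if n and trajectory[i:i + n] == pattern:
                out.append(ToolCall(name, n))
                i += n
                break
        else:
            # No tool matched at this position: keep the GUI step as-is.
            out.append(trajectory[i])
            i += 1
    return out


gui_only = [
    GuiAction("type", "editor"),
    GuiAction("click", "File"),
    GuiAction("click", "Save"),
    GuiAction("click", "Close"),
]
print(interleave(gui_only, TOOL_LIBRARY))
```

In this toy version the two menu clicks collapse into one `save_file` call while the surrounding GUI steps are preserved, which is the kind of mixed sequence a hybrid agent would need to learn from.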

Bootstrapping Smarter Switching Decisions

ToolCUA's training progresses through distinct phases. First, Tool-Bootstrapped GUI RFT combines supervised fine-tuning (SFT) with single-turn reinforcement learning (RL) to refine decisions at critical GUI-Tool switching junctures. This warmup phase improves the agent's ability to judge when to switch between action modalities.

Next, the agent is optimized with Online Agentic RL in a high-fidelity GUI-Tool environment. A key element here is the Tool-Efficient Path Reward, which rewards not only correct tool use but also shorter, more efficient execution paths.

On OSWorld-MCP, the ToolCUA agent reaches 46.85% accuracy, a 66% relative improvement over baselines and a 3.9% gain over GUI-only methods, establishing a new state of the art among comparable models.
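The article does not give the formula for the Tool-Efficient Path Reward, so the following is only a minimal sketch of a reward with the two properties described: it credits correct tool use and favors shorter paths. The function name, its parameters, the weights `w_tool` and `w_len`, and the `max_steps` budget are all assumptions made for illustration.

```python
def tool_efficient_path_reward(
    success: bool,
    steps: int,
    correct_tool_calls: int,
    total_tool_calls: int,
    max_steps: int = 30,   # assumed step budget per episode
    w_tool: float = 0.2,   # assumed weight for correct tool use
    w_len: float = 0.2,    # assumed weight for path efficiency
) -> float:
    """Toy episode-level reward: task success plus tool and length bonuses."""
    if not success:
        # Assumption: bonuses only apply when the task was completed.
        return 0.0
    # Fraction of tool calls that were correct (0 if no tools were used).
    tool_ratio = correct_tool_calls / total_tool_calls if total_tool_calls else 0.0
    # Shorter paths earn a larger bonus, clipped at zero past the budget.
    length_bonus = max(0.0, 1.0 - steps / max_steps)
    return 1.0 + w_tool * tool_ratio + w_len * length_bonus


short_path = tool_efficient_path_reward(True, steps=5, correct_tool_calls=2, total_tool_calls=2)
long_path = tool_efficient_path_reward(True, steps=15, correct_tool_calls=2, total_tool_calls=2)
print(short_path > long_path)
```

Under these assumed weights, two successful episodes with identical tool use are ranked by length, so the policy is nudged toward the path that reaches the goal in fewer steps.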

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.