Long-horizon robotic manipulation has been hampered by the inability of current video MLLMs to actively evaluate task progress. These models, typically trained via Supervised Fine-Tuning (SFT), primarily function as passive observers rather than critical evaluators of the current state against the final task goal. The introduction of PRIMO R1, a 7B framework, marks a pivotal shift by transforming these models into active "Critics".
From Passive Observers to Active Critics
PRIMO R1 leverages outcome-based Reinforcement Learning to explicitly incentivize Chain-of-Thought generation for progress estimation. This approach fundamentally alters the model's role, moving beyond simple event recognition to a more analytical function. The architecture is further enhanced by constructing a structured temporal input, explicitly anchoring the video sequence between initial and current state images, providing crucial temporal context for reasoning.
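The structured temporal input described above can be sketched as follows. This is a hypothetical illustration only: the anchor tags, the `Frame` placeholder, and `build_progress_prompt` are invented names, not part of any released PRIMO R1 code, and the real system would operate on image tensors rather than string tags.

```python
# Hypothetical sketch: anchor the sampled video frames between an
# initial-state image and a current-state image, so the model has explicit
# temporal context for progress reasoning. All names are illustrative.

from dataclasses import dataclass
from typing import List

@dataclass
class Frame:
    """Placeholder for one image (a path or tensor in a real pipeline)."""
    tag: str

def build_progress_prompt(initial: Frame, video: List[Frame],
                          current: Frame, instruction: str) -> List[str]:
    """Interleave anchors, frames, and the task text into one sequence."""
    seq = ["<initial_state>", initial.tag, "</initial_state>"]
    seq += ["<video>"] + [f.tag for f in video] + ["</video>"]
    seq += ["<current_state>", current.tag, "</current_state>"]
    seq += [f"Task: {instruction}",
            "Estimate task progress (0-100%) with step-by-step reasoning."]
    return seq

prompt = build_progress_prompt(
    Frame("img_init"),
    [Frame(f"img_{i}") for i in range(4)],
    Frame("img_now"),
    "stack the red block on the blue block",
)
print(prompt[0], prompt[-1])
```

The key design point is that the initial and current states are given as explicit anchors rather than buried inside the frame sequence, so the model can compare "where the task started" against "where it is now" directly.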
State-of-the-Art Performance and Generalization
The efficacy of PRIMO R1 is validated by extensive experiments on the proposed PRIMO Dataset and Benchmark, demonstrating state-of-the-art performance across diverse in-domain environments and out-of-domain real-world humanoid scenarios. Notably, the 7B PRIMO R1 model achieves a 50% reduction in mean absolute error compared to specialized reasoning baselines, and shows significant relative accuracy improvements over 72B-scale general MLLMs. Its capabilities extend to robust zero-shot generalization on challenging failure detection tasks. On the RoboFail benchmark, PRIMO R1 attains 67.0% accuracy, surpassing closed-source models like OpenAI o1 by 6.0%, underscoring its advanced capabilities in progress estimation for robot manipulation.
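For readers unfamiliar with the metric, the mean absolute error cited above is simply the average absolute gap between predicted and ground-truth progress values. The sketch below illustrates it with invented numbers; these are not results from the paper.

```python
# Minimal illustration of mean absolute error (MAE) for progress
# estimation. The predicted and ground-truth progress values (in %)
# are invented examples, not PRIMO R1 results.

def mean_absolute_error(preds, targets):
    """Average of |prediction - target| over all examples."""
    assert len(preds) == len(targets)
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)

preds   = [10.0, 45.0, 80.0, 100.0]   # model's progress estimates (%)
targets = [ 0.0, 50.0, 75.0, 100.0]   # annotated ground truth (%)

print(mean_absolute_error(preds, targets))  # → 5.0
```

A 50% reduction in this quantity means the model's progress estimates sit, on average, half as far from the annotated ground truth as the baselines' estimates do.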