CRAB

CRAB is a general-purpose benchmark framework for Multimodal Language Model (MLM) agents. It provides end-to-end, easy-to-use tooling to build agents, operate environments, and create benchmarks.

About
CRAB (Cross-environment Agent Benchmark) is a framework for evaluating Multimodal Language Model (MLM) agents. It covers building agents, operating diverse environments, and creating benchmarks. Its key components are cross-environment support, a graph evaluator for fine-grained performance analysis, and automated task generation using a graph-based method. The framework supports multiple environments, such as Ubuntu and Android, so a single agent can be evaluated across different interfaces. CRAB Benchmark-v0, built with this framework, includes 120 tasks across two environments. It was tested with six different MLMs under three communication settings, and the results are available on a leaderboard.
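To make the idea of a graph evaluator concrete, here is a minimal sketch in Python. It is not CRAB's actual API: the function `completion_ratio`, the checkpoint names, and the state keys are all hypothetical, and the scoring rule (a checkpoint counts only once every checkpoint it depends on has passed) is an assumption about how graph-based, fine-grained scoring might work.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, Set

State = Dict[str, object]
Check = Callable[[State], bool]


def completion_ratio(checks: Dict[str, Check],
                     deps: Dict[str, Set[str]],
                     state: State) -> float:
    """Return the fraction of checkpoint nodes that passed.

    Hypothetical rule: a node passes only if all of its predecessor
    nodes passed and its own check returns True for the given state.
    Assumes `checks` is non-empty and every name in `deps` is in `checks`.
    """
    # Map each checkpoint to the set of checkpoints it depends on.
    graph = {name: deps.get(name, set()) for name in checks}
    passed: Set[str] = set()
    # static_order() yields predecessors before the nodes that need them.
    for name in TopologicalSorter(graph).static_order():
        if graph[name] <= passed and checks[name](state):
            passed.add(name)
    return len(passed) / len(checks)


# Hypothetical cross-environment task: take a photo on Android,
# transfer it to Ubuntu, then open it in an image viewer.
checks = {
    "photo_taken": lambda s: s.get("android_photo") is not None,
    "file_transferred": lambda s: s.get("ubuntu_file") == s.get("android_photo"),
    "file_opened": lambda s: s.get("ubuntu_viewer_open") is True,
}
deps = {
    "file_transferred": {"photo_taken"},
    "file_opened": {"file_transferred"},
}

# Made-up environment snapshot: the last step has not been completed.
state = {
    "android_photo": "img_001.jpg",
    "ubuntu_file": "img_001.jpg",
    "ubuntu_viewer_open": False,
}
ratio = completion_ratio(checks, deps, state)
print(f"completion ratio: {ratio:.2f}")
```

Compared with a single pass/fail verdict, a ratio over checkpoints gives partial credit to an agent that completes some intermediate steps, which is one plausible reading of "fine-grained performance analysis".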

Score Breakdown
Overall: 25
Traction: 0
Team: 70
Visibility: 12
Profile: 50
Community: 0

Frequently Asked Questions
When was CRAB founded?
CRAB was founded in 2024.