
Voxel51 Research Reveals Auto-Labeling Achieves up to 95% of Human-Level Performance While Cutting Costs by 100,000x

Quantitative study shows how companies can save on annotation costs and make AI model development 5,000x faster.

Voxel51
Jun 4, 2025 at 3:00 PM

Voxel51, the most powerful visual AI data platform, today released groundbreaking research showing that auto-labeling technology can reach up to 95% of the performance of human labeling while operating 5,000x faster than traditional annotation methods. Labeling costs can be reduced by up to 100,000x, potentially saving millions of dollars in AI development costs.

Read research on how zero-shot auto-labeling rivals human performance.

"Our research shows that data annotation no longer has to be a multi-million-dollar line item," said Jason Corso, Co-founder and Chief Science Officer at Voxel51. "While previous research has qualitatively claimed auto-labeling reduces annotation costs, our study provides concrete figures that have significant implications. The findings reflect the potential for a massive reduction in costs for data labeling, enabling AI developers to invest more of their budget and human workforce on more effective data curation, quality assurance, model and edge-case analysis, and strategic dataset expansion."

Essential to powering computer vision, data labeling has traditionally been a tedious, costly, and slow process. To determine whether auto-labels alone could produce high-performing models in real-world scenarios, Voxel51’s Auto-Labeling Data for Object Detection study benchmarks leading foundation models, including YOLOE, YOLO-World, and Grounding DINO, across four widely used datasets: Berkeley Deep Drive (BDD, autonomous driving), Common Objects in Context (COCO), Large Vocabulary Instance Segmentation (LVIS, high complexity), and Visual Object Classes (VOC, general imagery). These datasets range from basic object categories to challenging, long-tail distributions.

Using mean Average Precision (mAP), a key real-world metric for object detection accuracy, the study found that models trained solely on auto-labels performed comparably to, and sometimes even better than, models trained on traditional human labels.
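For readers unfamiliar with the metric, here is a minimal sketch of how mAP can be computed. It is a simplification and not the study's evaluation code: it uses non-interpolated average precision at a single IoU threshold of 0.5, treats all boxes as coming from one image, and the function names are hypothetical. Benchmarks such as COCO instead average over several IoU thresholds and use interpolated precision.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Pred = Tuple[float, Box]                 # (confidence score, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds: List[Pred], truths: List[Box],
                      iou_thr: float = 0.5) -> float:
    """Non-interpolated AP for one class (simplifying assumption)."""
    if not truths:
        return 0.0
    matched = [False] * len(truths)
    true_positives = 0
    precision_sum = 0.0
    # Walk predictions from most to least confident.
    ranked = sorted(preds, key=lambda p: -p[0])
    for rank, (_, box) in enumerate(ranked, start=1):
        # Greedily match to the best still-unmatched ground-truth box.
        best_j, best_iou = -1, iou_thr
        for j, truth in enumerate(truths):
            if not matched[j]:
                overlap = iou(box, truth)
                if overlap >= best_iou:
                    best_j, best_iou = j, overlap
        if best_j >= 0:
            matched[best_j] = True
            true_positives += 1
            # Precision at the rank where recall increases.
            precision_sum += true_positives / rank
    return precision_sum / len(truths)


def mean_average_precision(
        per_class: Dict[str, Tuple[List[Pred], List[Box]]]) -> float:
    """Mean of the per-class AP values."""
    aps = [average_precision(p, t) for p, t in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0
```

Comparing this score for a detector trained on auto-labels against one trained on human labels, both evaluated on the same human-labeled test set, is the kind of comparison the article describes.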

Voxel51 Research Key Findings

  • Results showed AI-generated labels can achieve about 90–95% of the performance of human labeling while cutting costs by as much as 100,000x and labeling time by 5,000x.

For example, labeling 3.4 million objects on a single NVIDIA L40S GPU cost only $1.18 and took just over an hour. In comparison, manually labeling the same dataset via AWS SageMaker, which has among the least expensive annotation costs, would cost roughly $124,092 and take nearly 7,000 hours. 

  • In certain cases, such as detecting rare classes in COCO or VOC, auto-label-trained models outperformed those trained on human labels. This may occur because foundation models, trained on massive datasets, can generalize better than humans across diverse objects or label challenging edge cases more consistently.
  • While auto-labels achieve close to the performance of human labeling in many practical scenarios, careful consideration of dataset complexity and class definitions remains essential. For specialized or particularly challenging categories, teams should consider adopting hybrid annotation strategies, combining auto-labeling’s scalability with targeted human expertise. 
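The cost comparison in the first finding can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the article. The auto-labeling time is given only as "just over an hour" and the human time as "nearly 7,000 hours", so the hour values are rounded assumptions and the resulting speedup is a rough upper bound, not a figure from the study.

```python
# Figures quoted in the article.
NUM_OBJECTS = 3_400_000
AUTO_COST_USD = 1.18       # single NVIDIA L40S GPU
HUMAN_COST_USD = 124_092   # AWS SageMaker manual labeling estimate

# Rounded assumptions: the article gives only approximate durations.
AUTO_HOURS = 1.0           # "just over an hour"
HUMAN_HOURS = 7_000.0      # "nearly 7,000 hours"

# How many times cheaper auto-labeling is (roughly 105,000x).
cost_ratio = HUMAN_COST_USD / AUTO_COST_USD

# Per-object cost under each approach.
human_cost_per_object = HUMAN_COST_USD / NUM_OBJECTS
auto_cost_per_object = AUTO_COST_USD / NUM_OBJECTS

# Rough speedup; it shrinks as the true auto-labeling time exceeds one hour.
approx_speedup = HUMAN_HOURS / AUTO_HOURS

print(f"cost ratio: {cost_ratio:,.0f}x")
print(f"human cost per object: ${human_cost_per_object:.4f}")
print(f"auto cost per object:  ${auto_cost_per_object:.8f}")
print(f"approximate speedup (upper bound): {approx_speedup:,.0f}x")
```

The cost ratio of about 105,000x is consistent with the "up to 100,000x" headline figure. The speedup from these rounded inputs comes out higher than the reported 5,000x, which is expected because both durations are approximate.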

The full research report is available for download on the Voxel51 website.
