#Benchmarking
7 articles with this tag
AI Research
FACTS Benchmark Suite Elevates LLM Factuality Scrutiny
about 2 months ago

AI Research
AI hits a wall on FrontierMath performance
4 months ago

AI Research
NVIDIA Blackwell benchmarks show staggering AI economics
4 months ago

AI Research
Salesforce AI, Berkeley Unveil BFCL Audio Benchmark for Voice AI Precision
Salesforce AI Research and UC Berkeley have unveiled BFCL Audio, a new benchmark designed to rigorously evaluate the precision of AI models in handling audio-native function calls. In an announcement on its blog, the collaboration...
5 months ago

AI Research
MoNaCo Benchmark: A New Standard for Complex Question Answering
The benchmark exposed weaknesses in today’s most advanced models. Researchers tested 15 frontier LLMs, including GPT-5, Anthropic Claude Opus 4, Google Gemini 25 Pro, and OpenAI’s reasoning-focused o3.
5 months ago

Artificial Intelligence
The Shifting Sands of AI: Benchmarks, Open Source, and Infrastructure Wars
7 months ago
Press Release
Temenos sets new benchmark for scalability of AI-powered banking with Microsoft
9 months ago