#AI Benchmarks
5 articles with this tag

Artificial Intelligence
Exa Unveils New Code Search Benchmarks
Exa.ai releases 'WebCode', a new benchmark suite for evaluating search performance in coding agents, addressing limitations in existing tools.
about 17 hours ago

Technology
Poetiq's Small Team Sets New Together AI Benchmark Records
Poetiq's small team is setting new Together AI benchmark records with recursively self-improving meta-systems that optimize existing LLMs.
25 days ago

Artificial Intelligence
Claude Sonnet 4.6 Ups the AI Ante
Anthropic's Claude Sonnet 4.6 launches with major upgrades in coding, reasoning, and computer use, plus a 1M-token context window.
about 1 month ago

AI Research
Step 3.5 Flash: AI's New Efficiency Standard
The Step 3.5 Flash model targets efficiency with a 196B-parameter foundation and 11B active parameters, offering competitive performance at lower latency.
about 1 month ago

Artificial Intelligence
Claude Opus 4.6: Smarter, Faster, and Longer Context
Anthropic's Claude Opus 4.6 launches with a 1M-token context window, enhanced coding, and state-of-the-art benchmark performance.
about 2 months ago