#Benchmark

11 articles with this tag

AI Inference Demand 'Won't Stop Anytime Soon,' Says Benchmark's Vishria

Benchmark partner Eric Vishria discusses the booming demand for AI inference and the current venture capital funding climate for AI startups.

22 days ago

AI Research

Workflow Agents Lag Behind Demand

New Claw-Eval-Live benchmark reveals LLM agents struggle with dynamic workflows and verifiable execution, with top models failing over a third of tasks.

about 2 months ago

AI Research

LLMs Plan, But Do They Plan Safely?

New LLM robotic safety benchmark, DESPITE, finds scale boosts planning but not safety. Proprietary models lead, revealing a critical gap for safe robotic deployment.

about 2 months ago

Artificial Intelligence

Gumloop Secures $50M Series B

Gumloop secures $50M Series B led by Benchmark to enhance its AI automation and agent platform for enterprises.

3 months ago

AI Research

Gemini Deep Research Unlocks Advanced AI for Devs

6 months ago

Startup News

Clarifai Hits Fastest GPT-OSS-120B Inference and Narrows the GPU, ASIC Gap

Clarifai’s latest benchmark on OpenAI’s GPT-OSS-120B model points to a quiet but important shift in AI infrastructure.

7 months ago

AI Research

Salesforce Agentic AI Gets Real-World Performance Benchmark

8 months ago

Funding Round

Applied Compute's Agent Workforce Targets Niche AI with $80M

A stealthy startup from ex-OpenAI researchers, Applied Compute, has emerged with $80 million in funding to argue that general-purpose AI is just the beginnin...

8 months ago

Startup News

Exa raises $85M to build a search engine for AIs

10 months ago

AI Video

Darwinian Evolution and Silicon Valley's AI Imperative

11 months ago