Llamabench

Crowd-sourced local LLM benchmarks for any hardware, showing exact configurations for optimal token/sec performance.

Active

About

Llamabench is a crowd-sourced platform for benchmarking local Large Language Models (LLMs). Users can look up the exact hardware configurations and settings, such as quantization and KV cache, that achieve the best tokens-per-second performance. It is aimed at GPU buyers, quant tinkerers, and inference engineers who want to understand and improve the performance of their local LLM rigs.

Technology stack

Detected: 2026-07-02
CDN: Cloudflare
Email: unknown
Stack: SvelteKit, Tailwind CSS, Bootstrap, Ghost

Frequently asked

What does Llamabench do?

Llamabench is a crowd-sourced platform for benchmarking local Large Language Models (LLMs). Users can look up the exact hardware configurations and settings, such as quantization and KV cache, that achieve the best tokens-per-second performance. It is aimed at GPU buyers, quant tinkerers, and inference engineers who want to understand and improve the performance of their local LLM rigs.

What industry does Llamabench operate in?

Llamabench operates in the following categories: foundation models, large language models, AI infrastructure, AI testing, developer tools, and benchmarking.