Llamabench

Crowd-sourced local LLM benchmarks for any hardware, showing exact configurations for optimal token/sec performance.
About
Llamabench is a crowd-sourced platform for benchmarking local Large Language Models (LLMs). It lets users find the exact hardware configurations and settings, such as quantization and KV cache settings, that achieve the best tokens-per-second throughput. It is aimed at GPU buyers, quant tinkerers, and inference engineers who want to understand and improve the performance of their local LLM rigs.
Technology stack
Detected: 2026-07-02
CDN: Cloudflare
Email: unknown
Stack: SvelteKit, Tailwind CSS, Bootstrap, Ghost
Frequently asked
What industry does Llamabench operate in?
Llamabench is listed under the following categories: foundation models, large language models, AI infrastructure, AI testing, developer tools, and benchmarking.