Locai Labs, a UK-based AI firm, has thrown a wrench into the frontier model race, announcing the open-source release of Locai L1-Large. The company claims its new model, built on the Qwen3 235B Instruct architecture, has surpassed leading proprietary models, including GPT-5, Claude Sonnet 4.5, and Gemini 2.5 Flash, on the critical Arena Hard v2 benchmark for conversational alignment and human preference.
The achievement is notable not just for the performance metrics, but for *how* Locai L1-Large was trained. Locai Labs developed a new post-training framework called "Forget-Me-Not," which allows the model to self-improve on downstream tasks without relying on expensive, human-labeled preference data.
Forget-Me-Not combines experience replay (mixing earlier data back into training to prevent catastrophic forgetting) with self-improvement (the model generates and grades its own training data). The process targets broad alignment goals such as helpfulness, conciseness, and factuality. The results appear to validate the methodology: Locai L1-Large showed a 2.1% improvement on Arena Hard v2 over the base Qwen model and a 17% improvement on the AgentHarm safety benchmark, suggesting the self-judgment process effectively filtered out harmful outputs.
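Locai has not published its training code, but the interplay of the two ideas can be sketched in a few lines. In the sketch below every function name (generate_candidates, self_judge, self_improvement_round) is hypothetical and the generation and judging steps are stubbed out; the point is only to show how self-judged outputs and replayed examples end up in the same fine-tuning batch.

```python
"""Minimal sketch of a Forget-Me-Not-style self-improvement round (hypothetical)."""

import random
from dataclasses import dataclass


@dataclass
class Example:
    prompt: str
    response: str


def generate_candidates(prompt: str, n: int = 4) -> list[str]:
    # Placeholder: in practice, sample n responses from the current policy model.
    return [f"candidate {i} for: {prompt}" for i in range(n)]


def self_judge(prompt: str, response: str) -> float:
    # Placeholder: in practice, the model scores its own output against rubrics
    # such as helpfulness, conciseness, factuality, and safety.
    return random.random()


def self_improvement_round(prompts: list[str],
                           replay_buffer: list[Example],
                           replay_ratio: float = 0.3) -> list[Example]:
    """Build one fine-tuning batch: best self-judged responses plus replayed data."""
    new_examples = []
    for prompt in prompts:
        candidates = generate_candidates(prompt)
        best = max(candidates, key=lambda r: self_judge(prompt, r))
        new_examples.append(Example(prompt, best))

    # Experience replay: mix a fraction of earlier examples back in so the model
    # does not drift away from previously learned behaviour.
    n_replay = int(len(new_examples) * replay_ratio)
    replayed = random.sample(replay_buffer, min(n_replay, len(replay_buffer)))

    batch = new_examples + replayed
    replay_buffer.extend(new_examples)  # grow the buffer for future rounds
    random.shuffle(batch)
    return batch


if __name__ == "__main__":
    buffer: list[Example] = [Example("older prompt", "older response")]
    batch = self_improvement_round(["Explain FP8 quantisation briefly."], buffer)
    print(f"fine-tuning batch of {len(batch)} examples")
```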
Crucially, Locai achieved this efficiency while operating under significant computational constraints compared to its US rivals. The 235-billion-parameter Mixture of Experts (MoE) model was fine-tuned using only one node of eight NVIDIA H200 GPUs, leveraging advanced techniques like Parameter Efficient Fine-Tuning (PEFT) and multi-dimensional parallelization. Locai also emphasized that training occurred in UK data centers powered by 100% renewable energy, aligning with its broader "Community AI" (Co-AI) vision focused on sustainability and sovereignty.
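For readers unfamiliar with PEFT, the rough shape of such a setup using the Hugging Face peft library is shown below. The LoRA rank, target modules, and the exact Qwen repository ID are illustrative assumptions rather than Locai's published recipe, and sharding a 235-billion-parameter MoE model across eight H200s would in practice require the multi-dimensional parallelism the company describes rather than a simple device map.

```python
# Illustrative PEFT (LoRA) setup; hyperparameters are guesses, not Locai's recipe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

MODEL_ID = "Qwen/Qwen3-235B-A22B-Instruct-2507"  # assumed repo ID for the base model

# In practice a 235B MoE model must be sharded across the node's eight GPUs
# (tensor/pipeline/expert parallelism); device_map="auto" is a stand-in here.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

lora_config = LoraConfig(
    r=16,                    # low-rank adapter dimension (illustrative)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of the 235B weights train
```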
The British Alignment Strategy
While most frontier models aim for a generalized, often US-centric, cultural neutrality, Locai L1-Large has deliberately carved out a specific cultural niche. The company explicitly post-trained the model to default to British English spelling and grammar and to deepen its awareness of British cultural norms.
To achieve this, Locai leveraged CultureBank, a community-driven knowledge base that captures validated cultural descriptors and judgments. The model was prompted to generate nuanced, culturally sensitive advice using this context, ensuring its responses are appropriate for a British audience without resorting to generic answers.
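The exact prompting recipe has not been published, but the general pattern of grounding a response in retrieved cultural descriptors might look something like the sketch below, in which the descriptor contents and instructions are illustrative rather than drawn from CultureBank's actual schema.

```python
# Hypothetical sketch of grounding a response in CultureBank-style context.
# The descriptors and prompt wording are illustrative; neither CultureBank's
# schema nor Locai's prompting recipe is described in detail in the article.

def build_culturally_grounded_prompt(user_question: str, descriptors: list[str]) -> str:
    """Prepend validated cultural descriptors so the model tailors its advice."""
    context = "\n".join(f"- {d}" for d in descriptors)
    return (
        "You are advising a user in the United Kingdom. Use British English "
        "spelling and grammar.\n"
        "Relevant cultural observations (community-validated):\n"
        f"{context}\n\n"
        f"Question: {user_question}\n"
        "Give nuanced, culturally sensitive advice rather than a generic answer."
    )


example = build_culturally_grounded_prompt(
    "What should I bring when invited to a colleague's house for dinner?",
    ["Bringing a bottle of wine or a small gift for the host is customary.",
     "Arriving slightly after the stated time is generally acceptable."],
)
print(example)
```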
Beyond cultural alignment, Locai addressed a significant issue in global AI: the neglect of low-resource languages. Recognizing that poor performance in languages like Welsh, Irish, and Scottish Gaelic could accelerate their decline, Locai developed an instruction tuning dataset based on bidirectional translations sourced from movie subtitles. This effort aims to improve multilingual proficiency in languages with a small digital footprint, a move that directly benefits specific regional user bases often overlooked by larger AI labs.
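Bidirectional instruction pairs of this kind are straightforward to construct from aligned subtitle lines. The sketch below uses hypothetical templates and a single illustrative English-Welsh pair; the article only states that the dataset was derived from movie subtitles.

```python
# Illustrative sketch of building bidirectional translation instructions from
# aligned subtitle lines. Field names and templates are hypothetical.

def make_bidirectional_pairs(english: str, welsh: str) -> list[dict]:
    """Produce one instruction example in each translation direction."""
    return [
        {
            "instruction": "Translate the following sentence into Welsh.",
            "input": english,
            "output": welsh,
        },
        {
            "instruction": "Translate the following sentence into English.",
            "input": welsh,
            "output": english,
        },
    ]


aligned_subtitles = [
    # One illustrative aligned pair standing in for a subtitle corpus.
    ("Where is the nearest train station?", "Ble mae'r orsaf drenau agosaf?"),
]

dataset = [pair for en, cy in aligned_subtitles for pair in make_bidirectional_pairs(en, cy)]
print(f"{len(dataset)} instruction examples")
```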
For deployment, Locai further optimized the model by converting it into an FP8 variant using Post-Training Quantisation (PTQ). In a clever move, they used the model’s own self-improvement data as the calibration dataset, allowing them to halve the model size and double throughput while retaining performance—all without requiring any human-labeled samples.
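Conceptually, PTQ calibration of this sort amounts to running the calibration samples through the model, recording activation ranges, and choosing scales so the values fit the FP8 format. The sketch below simulates that idea for a single tensor, re-using (hypothetical) self-improvement prompts as the calibration set; a production pipeline would rely on a dedicated quantisation toolkit rather than this hand-rolled fake-quantisation.

```python
# Conceptual sketch of FP8 post-training quantisation calibration, not Locai's pipeline.
import torch

FP8_E4M3_MAX = 448.0  # largest representable magnitude in the e4m3 format


def calibrate_fp8_scale(activations: list[torch.Tensor]) -> float:
    """Choose a per-tensor scale so observed calibration activations fit the FP8 range."""
    amax = max(a.abs().max().item() for a in activations)
    return amax / FP8_E4M3_MAX


def fake_quantise_fp8(x: torch.Tensor, scale: float) -> torch.Tensor:
    """Simulate an FP8 round-trip: scale, clamp, cast to float8, cast back, rescale."""
    q = (x / scale).clamp(-FP8_E4M3_MAX, FP8_E4M3_MAX).to(torch.float8_e4m3fn)
    return q.to(x.dtype) * scale


# Calibration data: activations captured while running the self-improvement
# prompts through the model, instead of human-labelled samples (simulated here).
calibration_activations = [torch.randn(4, 1024) * 3.0 for _ in range(8)]
scale = calibrate_fp8_scale(calibration_activations)

x = torch.randn(2, 1024) * 3.0
x_q = fake_quantise_fp8(x, scale)
print(f"scale={scale:.4f}, max abs error={(x - x_q).abs().max().item():.4f}")
```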
Locai L1-Large is now available as an open-source release on Hugging Face and also powers a new Early Access web application called Locai.chat. By combining open-source transparency, extreme efficiency, and a novel self-improvement framework that bypasses costly human labeling, Locai Labs is challenging the notion that only heavily funded US and Chinese giants can compete at the frontier of AI alignment.



