#Reasoning
14 articles with this tag
LLMs Gain Persistent, Verifiable Memory
New hybrid LLM architecture augments parametric knowledge with structured ontological memory for persistent, verifiable, and enhanced reasoning.

François Chollet on ARC-AGI-3: The Future of AI Reasoning
François Chollet discusses ARC-AGI-3, a new benchmark for AI reasoning, highlighting current AI's limitations and the path toward general intelligence.
New Models Tackle Reasoning Puzzles with Symmetry
New Symbol-Equivariant Recurrent Reasoning Models (SE-RRMs) offer improved performance and generalization on reasoning tasks like Sudoku and ARC-AGI by explicitly encoding symmetry.
Recursive LLMs Tackle Long-Horizon Reasoning
New research introduces recursive language models to overcome context limitations, showing significant improvements on long-horizon reasoning tasks like Boolean satisfiability.

LLMs Lost in Transmission: Why Global Reasoning Fails
A new paper reveals that transformer LLMs struggle with complex global reasoning because of limited 'effective bandwidth,' a limitation that Chain of Thought can overcome.

Uniqueness-Aware RL stops LLMs from getting lazy
Uniqueness-Aware RL prevents LLMs from converging on a single solution path by explicitly rewarding correct answers that employ rare problem-solving strategies.

Google Gemini 3 Redefines AI Reasoning and Efficiency

Google Gemini 3 Redefines Frontier AI Capabilities

DeepSeek V3.2 Release: Agent Focus Hits GPT-5 Level
The DeepSeek V3.2 release signals a significant push in the open-source LLM race, not just chasing raw benchmark scores but specifically targeting agentic ca...
Claude Opus 4.5 Arrives, Dominating Code and Agents
Anthropic just dropped Claude Opus 4.5, and the initial data suggests this isn't just an incremental update.

Jeremy Berman’s Evolutionary Leap: Natural Language for ARC-AGI-2
