#Large Language Models
50 articles with this tag

AI Societies' Safety Problem
Self-evolving AI societies face a trilemma: they cannot achieve continuous learning, isolation, and safety alignment all at once.

Testing AI Guardrails Across Languages
Researchers tested context-aware AI guardrails across English and Farsi in humanitarian scenarios, finding nuanced performance differences and highlighting the need for language-specific safety evaluations.

AI Coding Tests Flawed by Infrastructure Noise
The infrastructure powering AI coding tests can significantly inflate or deflate model scores, potentially masking true capabilities and misleading deployment decisions.

Claude Opus 4.6: Smarter, Faster, and Longer Context
Anthropic's Claude Opus 4.6 launches with a 1M token context window, enhanced coding, and state-of-the-art benchmark performance.

Uniqueness-Aware RL Stops LLMs from Getting Lazy
Uniqueness-Aware RL prevents LLMs from converging on a single solution path by explicitly rewarding correct answers that employ rare problem-solving strategies.

AI’s Dual Reality: Safety Theater and the Autonomous Arms Race to AGI

NeuroDiscoveryBench Sets New Standard for Neuroscience AI Benchmarks

A Philosopher's Lens on AI's Evolving Consciousness

Anthropic Unveils Advanced APIs for Agentic AI Development

Claude.ai: Amplifying Human-AI Collaboration Through Intelligent Context and Customization

Claude.ai's Projects Feature Elevates Enterprise AI Interaction

GPT-5.1: The Art and Science of Intelligent Personalities

Building Cursor Composer – Lee Robinson, Cursor

Claude's Research Feature Redefines Information Synthesis for Elite Professionals

OpenAI's Future Hinges on Enterprise Adoption and Sustained Funding

Meta's AI Investment Pays Off: A Clear Return Amidst the Tech Race

How OpenAI Builds for 800 Million Weekly Users: Model Specialization and Fine-Tuning

Claude's Agent Skills Unlock Granular AI Expertise

Agentic AI Rewrites the Rules for Real-Time Sports Fan Engagement

Claude Opus 4.5 Unlocks Advanced Reasoning and Efficiency

Context Engineering: The Graph-Powered Evolution of AI Context

The Shifting Sands of AI Supremacy: ChatGPT's Lightning Bolt Meets Gemini's Insane Leap

Gemini's Ascent: Google's Existential Challenge to OpenAI

Anthropic's Opus 4.5: Redefining AI Capabilities and Efficiency

Claude Opus 4.5 Delivers Actionable Outputs for Complex Business Tasks

Claude Code Redefines Developer Workflows on Desktop

Claude Kayak Rumor: Anthropic's Next AI Bet

GLM 4.6 Challenges Frontier Models with Open-Source Prowess

Claude's Evolution: From Chatbot to Cognitive Collaborator

GPT-5's Scientific Revolution: From Niche Proofs to Accelerated Discovery

Google's Gemini 3 Dominance Reshapes AI Landscape

Reflexivity AI Accelerates Investment Insights for Institutions

GPT-5.1: OpenAI’s Leap Towards Human-Centric AI and Enterprise Efficiency

vLLM Solves the AI Model Serving Conundrum at Scale

AI's Leap into the Physical: Project Fetch's Robot Dog Revelation

Model Context Protocol: Streamlining AI Agent Interaction with Cloud Tools

Intelligence Is "Less is More": A Fundamental Challenge to LLMs

Anthropic's Introspection Paper Hints at AI Self-Awareness

Wikipedia Founder Jimmy Wales on AI's Factual Blind Spot

Google's Model Armor: The AI Bodyguard Preventing Digital Catastrophes

ENEOS Materials Redefines Enterprise AI Adoption with ChatGPT Enterprise

Beyond LLMs: Crafting Robust AI with Multi-Method Agentic Architectures

Anthropic's Claude: Reshaping Finance from Curiosity to Production

OpenAI's Browser Gambit: Reshaping the AI Interface

ChatGPT Unlocks Enterprise Data with New Company Knowledge Feature

Claude's New Memory Feature Elevates AI Personalization

AI Agents: From Prediction to Autonomous Action

AbbVie's AI Strategy: Reshaping Pharma from Discovery to Patient Impact

Claude for Life Sciences: Reshaping Scientific Discovery
