#AI Alignment
8 articles with this tag
OpenAI Launches Safety Fellowship
OpenAI launches a new fellowship for external researchers focused on AI safety and alignment, offering stipends and mentorship.

AI Societies' Safety Problem
Self-evolving AI societies face a trilemma: continuous learning, isolation, and safety alignment cannot all be achieved simultaneously.

The Assistant Axis LLM: How Researchers Are Capping AI Drift
Scientists have mapped the internal neural space of LLMs, identifying the "Assistant Axis" as the key to stabilizing AI persona and preventing harmful behavior.

OpenAI is Debugging LLM Misalignment: New Tools Emerge
Researchers are tackling the challenge of understanding and correcting undesirable LLM behavior with a new technique called latent attribution, detailed by ...

Emmett Shear on Building AI That Actually Cares: Beyond Control and Steering

Locai L1-Large beats GPT-5 on alignment using 'Forget-Me-Not'
