#AI Alignment

8 articles with this tag

OpenAI Launches Safety Fellowship

OpenAI launches a new fellowship for external researchers focused on AI safety and alignment, offering stipends and mentorship.

about 1 month ago

AI Research

AI Societies' Safety Problem

Self-evolving AI societies face a trilemma: continuous learning, isolation, and safety alignment cannot all be achieved at once.

3 months ago

AI Research

The Assistant Axis LLM: How Researchers Are Capping AI Drift

Scientists have mapped the internal neural space of LLMs, identifying the "Assistant Axis" as the key to stabilizing AI persona and preventing harmful behavior.

4 months ago

Artificial Intelligence

OpenAI is Debugging LLM Misalignment: New Tools Emerge

Researchers are tackling the challenge of understanding and correcting undesirable LLM behavior with a new technique called latent attribution, detailed by ...

6 months ago

AI Research

OpenAI is Debugging LLM Misalignment: New Tools Emerge

Researchers are tackling the challenge of understanding and correcting undesirable LLM behavior with a new technique called latent attribution, detailed by ...

6 months ago

AI Video

Emmett Shear on Building AI That Actually Cares: Beyond Control and Steering

6 months ago

Artificial Intelligence

Locai L1-Large beats GPT-5 on alignment using 'Forget-Me-Not'

6 months ago

AI Video

AI's Alignment Imperative: A Race for Wisdom

10 months ago