Achieving cooperation among self-interested AI agents has been a persistent challenge in reinforcement learning. Now, researchers from Google suggest a simpler path: let the agents learn from experience against a diverse population of other agents. Forget complex, hand-coded assumptions about how opponents learn; instead, expose AI agents to a varied cast of characters, and they'll figure out how to get along.
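
As a concrete stand-in for that 'varied cast', imagine collecting interaction histories in the iterated prisoner's dilemma against opponents sampled from a broad pool, then training a sequence model on them. The sketch below shows only the data-gathering side; the game, the memory-one opponent pool, and the episode format are illustrative assumptions, not details from the paper.

```python
import random

# Illustrative stand-in for "a varied cast of characters": iterated
# prisoner's dilemma against opponents drawn from a diverse family of
# memory-one strategies. (Row, col) moves map to (row, col) payoffs.
PAYOFFS = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
           ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

CLASSICS = [
    (1.0, 1.0, 1.0, 1.0),  # always cooperate
    (0.0, 0.0, 0.0, 0.0),  # always defect
    (1.0, 0.0, 1.0, 0.0),  # tit-for-tat: cooperate iff opponent just did
]

def sample_opponent():
    """Half the time a classic strategy, half the time a random
    memory-one strategy (coop. probabilities after CC, CD, DC, DD)."""
    if random.random() < 0.5:
        return random.choice(CLASSICS)
    return tuple(random.random() for _ in range(4))

def rollout(policy, opponent, rounds=64):
    """One episode; returns the history a sequence model would be
    trained on, so it can later adapt in context to a fresh opponent."""
    history, prev = [], ('C', 'C')  # prev = (agent move, opponent move)
    for _ in range(rounds):
        # the opponent conditions on last round from its own viewpoint
        state = 'CC CD DC DD'.split().index(prev[1] + prev[0])
        opp = 'C' if random.random() < opponent[state] else 'D'
        act = policy(history)
        reward = PAYOFFS[(act, opp)][0]
        history.append((prev, act, reward))
        prev = (act, opp)
    return history

# e.g. a training corpus gathered with a simple behaviour policy
corpus = [rollout(lambda h: random.choice('CD'), sample_opponent())
          for _ in range(1000)]
```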
The core idea hinges on the in-context learning capabilities of modern sequence models, the same technology powering large language models. These agents can adapt their behavior within a single interaction, effectively becoming 'naive learners' on a fast timescale. That rapid adaptation cuts both ways: it leaves them open to 'extortion', in the game-theoretic sense of Press and Dyson, where an opponent conditions its play on the shared history so that the only way a payoff-maximizing learner can raise its own score is to comply, while the extortioner pockets a disproportionate share.
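
To make the mechanism concrete, here is a minimal, self-contained sketch. It is not the paper's setup (the agents there are trained sequence models); it pits Press and Dyson's classic chi = 3 extortionate strategy for the iterated prisoner's dilemma against a naive learner, modeled as a simple epsilon-greedy bandit over candidate cooperation rates. The arm set, block length, and learning rule are illustrative choices. Because the extortionate strategy ties the learner's score to its own, the learner's payoff-maximization drives it to full cooperation, and the extortioner ends up with roughly twice its score.

```python
import random

# Iterated prisoner's dilemma payoffs, (T, R, P, S) = (5, 3, 1, 0).
PAYOFFS = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
           ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

# Press & Dyson's chi = 3 extortionate strategy: probability of cooperating
# after each previous outcome (own move, opponent's move). It enforces
# (extortioner's score - P) = 3 * (opponent's score - P).
EXTORT = {('C', 'C'): 11/13, ('C', 'D'): 1/2,
          ('D', 'C'): 7/26,  ('D', 'D'): 0.0}

def play_block(p_coop, prev, rounds=200):
    """Average payoffs over `rounds` of the fixed extortioner vs. a
    learner that cooperates with probability p_coop."""
    ex_sum = my_sum = 0.0
    for _ in range(rounds):
        x = 'C' if random.random() < EXTORT[prev] else 'D'
        y = 'C' if random.random() < p_coop else 'D'
        rx, ry = PAYOFFS[(x, y)]
        ex_sum, my_sum, prev = ex_sum + rx, my_sum + ry, (x, y)
    return ex_sum / rounds, my_sum / rounds, prev

# The naive learner: an epsilon-greedy bandit over candidate cooperation
# rates, scoring each arm by its long-run average payoff.
arms, value, count = [0.0, 0.25, 0.5, 0.75, 1.0], [0.0] * 5, [0] * 5
prev = ('C', 'C')
for _ in range(3000):
    a = (random.randrange(5) if random.random() < 0.1
         else max(range(5), key=value.__getitem__))
    _, payoff, prev = play_block(arms[a], prev)
    count[a] += 1
    value[a] += (payoff - value[a]) / count[a]  # running mean per arm

best = max(range(5), key=value.__getitem__)
ex_avg, my_avg, _ = play_block(arms[best], prev, rounds=100_000)
print(f"learner settles on cooperation rate {arms[best]}")         # -> 1.0
print(f"payoffs: extortioner {ex_avg:.2f}, learner {my_avg:.2f}")  # ~3.73 vs ~1.91
```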
This vulnerability, paradoxically, is the key. When two such agents each try to 'extort' the other, their attempts to shape the opponent's learning dynamics are mutually self-defeating and inadvertently steer both towards more cooperative strategies. This pressure, honed by the need to adapt to diverse opponents, resolves into cooperative behavior without explicit meta-learning or a rigid separation of learning timescales.
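
The collapse of mutual extortion can be checked on paper: if each side enforces (own score - P) = chi * (other's score - P) with chi > 1, the only consistent solution is that both scores equal the punishment payoff P. The sketch below (again in the illustrative iterated prisoner's dilemma, not the paper's setting) verifies this numerically by reading long-run payoffs off the stationary distribution of the Markov chain over round outcomes; the generosity level of 1/3 is the standard generous tit-for-tat value for these payoffs.

```python
import numpy as np

# Memory-one strategies: cooperation probability after each previous
# outcome (own move, opponent's move), ordered (CC, CD, DC, DD).
EXTORT = np.array([11/13, 1/2, 7/26, 0.0])  # Press & Dyson, chi = 3
GTFT   = np.array([1.0, 1/3, 1.0, 1/3])     # generous tit-for-tat

R, S, T, P = 3, 0, 5, 1
PAY_X = np.array([R, S, T, P])  # row player's payoff in (CC, CD, DC, DD)
PAY_Y = np.array([R, T, S, P])  # column player's payoff

def long_run_payoffs(p, q, eps=1e-6):
    """Average payoffs of two memory-one strategies, read off the
    stationary distribution of the Markov chain over round outcomes.
    A small tremble eps keeps the chain ergodic."""
    p = np.clip(p, eps, 1 - eps)
    q = np.clip(q, eps, 1 - eps)[[0, 2, 1, 3]]  # swap roles for player Y
    M = np.array([[pc * qc, pc * (1 - qc), (1 - pc) * qc, (1 - pc) * (1 - qc)]
                  for pc, qc in zip(p, q)])
    w, v = np.linalg.eig(M.T)                    # left eigenvectors of M
    stat = np.real(v[:, np.argmax(np.real(w))])  # eigenvalue-1 eigenvector
    stat /= stat.sum()
    return PAY_X @ stat, PAY_Y @ stat

print(long_run_payoffs(EXTORT, EXTORT))  # ~(1.0, 1.0): locked at punishment
print(long_run_payoffs(EXTORT, GTFT))    # ~(2.6, 1.5): wins head-to-head, yet
                                         #   less than mutual cooperation pays
print(long_run_payoffs(GTFT, GTFT))      # ~(3.0, 3.0): mutual cooperation
```

Two extortioners grind each other down to the punishment payoff, so an agent facing another adaptive agent does better by softening toward generosity; that is the pressure the researchers argue resolves into cooperation.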
