Artificial intelligence is no longer just a tool for end-user products; it's now a critical component in building more sophisticated AI itself. This shift is evident in how AI is optimizing infrastructure, training workflows, and the very systems used for AI development. LinkedIn Engineering began exploring this in August 2025, using agent loops to refine LLM post-training runs. The initial success was not just in task automation but in creating a structural loop of proposing, testing, measuring, and improving.
This realization spurred an internal project in January 2026 with a clear goal: use AI to improve AI systems, which in turn requires platforms designed with agents in a central role. The resulting strategy unifies three pillars for experimentation at scale: agents for distributed training code, comprehensive evaluation systems, and efficient GPU microscheduling. Together, these let agents run model trials in parallel with minimal human oversight.
Within this setup, agents optimize for both model quality and training efficiency in an inner loop. Once the best-performing architecture is found, it is scaled up through distributed training in an outer loop. This approach was first applied to migrating LinkedIn's large fleet of TensorFlow models to PyTorch, resulting in Autopilot for Torch. This specialized agent doesn't just convert a model in a single pass; it iteratively refines its generated code based on LLM reasoning and verifier feedback.
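To make the inner/outer split concrete, here is a minimal Python sketch. Everything in it is hypothetical and not taken from LinkedIn's system: the `TrialResult` fields, the selection rule (cheapest trial that clears a quality floor), and the `fake_trial` and `scale_up` stand-ins are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class TrialResult:
    config: dict           # hypothetical hyperparameters for one trial
    quality: float         # higher is better, e.g. a validation metric
    cost_gpu_hours: float  # lower is better


def inner_loop(
    configs: Iterable[dict],
    run_trial: Callable[[dict], TrialResult],
    min_quality: float,
) -> Optional[TrialResult]:
    """Run small trials and pick the cheapest one that meets the quality floor.

    The selection rule is an assumption; a real system could weigh
    quality against efficiency in many other ways.
    """
    results = [run_trial(cfg) for cfg in configs]
    eligible = [r for r in results if r.quality >= min_quality]
    if not eligible:
        return None
    # Prefer lower cost; break ties in favor of higher quality.
    return min(eligible, key=lambda r: (r.cost_gpu_hours, -r.quality))


def outer_loop(best: TrialResult, scale_up: Callable[[dict], dict]) -> dict:
    """Hand the winning config to a (hypothetical) distributed training launcher."""
    return scale_up(best.config)


def fake_trial(config: dict) -> TrialResult:
    """Stand-in for a real training run: bigger models score higher and cost more."""
    hidden = config["hidden"]
    return TrialResult(config, quality=1 - 1 / hidden, cost_gpu_hours=hidden / 64)


if __name__ == "__main__":
    candidates = [{"hidden": 32}, {"hidden": 64}, {"hidden": 128}]
    best = inner_loop(candidates, fake_trial, min_quality=0.98)
    if best is not None:
        print(outer_loop(best, lambda cfg: {**cfg, "world_size": 8}))
```

In this toy setup the inner loop only decides which configuration is worth scaling; the outer loop is a single call because the expensive distributed run happens once, after the search has settled.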
The pattern quickly expanded to other use cases like kernel generation and auto-tuning, where agents autonomously search, evaluate, and enhance system performance. The core loop is a cycle of generate → verify → refine.
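The generate → verify → refine cycle can be sketched in a few lines of Python. This is an illustrative skeleton under stated assumptions, not the actual agent: `generate_verify_refine`, `Verdict`, and the toy generator and verifier are hypothetical names, and the string replacement merely stands in for an LLM call and a real verifier such as a test suite or numerical comparison.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Verdict:
    passed: bool
    feedback: str  # what the verifier wants fixed; empty when passed


def generate_verify_refine(
    generate: Callable[[Optional[str], Optional[str]], str],
    verify: Callable[[str], Verdict],
    max_rounds: int = 5,
) -> Tuple[Optional[str], int]:
    """Loop until the verifier accepts a candidate or the round budget runs out.

    Returns (accepted_candidate_or_None, rounds_used).
    """
    candidate: Optional[str] = None
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        # Generate (first round) or refine (later rounds) using prior feedback.
        candidate = generate(candidate, feedback)
        verdict = verify(candidate)
        if verdict.passed:
            return candidate, round_no
        feedback = verdict.feedback
    return None, max_rounds


def toy_generate(previous: Optional[str], feedback: Optional[str]) -> str:
    """Stand-in for an LLM: the first draft is wrong, then it reacts to feedback."""
    if previous is None:
        return "y = tf.nn.relu(x)"
    if feedback == "still references tf":
        return previous.replace("tf.nn.relu", "torch.relu")
    return previous


def toy_verify(candidate: str) -> Verdict:
    """Stand-in for a verifier: rejects any leftover TensorFlow reference."""
    if "tf." in candidate:
        return Verdict(False, "still references tf")
    return Verdict(True, "")


if __name__ == "__main__":
    print(generate_verify_refine(toy_generate, toy_verify))
```

The round budget matters in practice: without `max_rounds`, a generator that never satisfies the verifier would loop forever, so the sketch returns `None` once the budget is exhausted.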