Thinking Machines is undergoing a significant leadership shift. Co-founder and CTO Barret Zoph has departed the AI firm, as confirmed by CEO Mira Murati on Twitter. Stepping into the top technical role is Soumith Chintala, a move signaling a potential pivot or consolidation of technical strategy.
Murati praised Chintala as a "brilliant and seasoned leader" with over a decade in AI, highlighting his existing contributions to the team. The transition comes at a critical juncture for AI companies navigating rapid development and intense competition, and the departure of a co-founder often signals strategic divergence or internal restructuring.
Leadership Shakeup Signals AI Strategy Shift
Chintala's background is deep in the trenches of machine learning infrastructure, suggesting Thinking Machines may be doubling down on core engineering or specific research avenues where he has established expertise. For users and the broader industry, the change in technical stewardship at a company focused on advanced AI warrants attention.
It raises questions about the future direction of their product roadmap and research priorities following this executive transition. The speed of these high-level changes in the AI sector remains a defining characteristic of the current tech landscape.
Updated: Barret Zoph, Luke Metz, and Sam Schoenholz have joined OpenAI, as reported by Fidji Simo, OpenAI's CEO of Applications.
Barret Zoph, Luke Metz, and Sam Schoenholz are three of the most influential researchers in the modern era of artificial intelligence. All three spent significant time at Google Brain and later became pillars of the research team at OpenAI.
Barret Zoph
Barret Zoph is widely regarded as one of the "architects" of the modern LLM post-training process. Before leaving OpenAI, he served as the Vice President of Post-Training, leading the teams responsible for making models like GPT-4 useful and safe for public interaction.
- Key Contributions: He is famous for his pioneering work on Neural Architecture Search (NAS), which uses AI to design better AI architectures.
- The "Instruction" Specialist: At OpenAI, he was instrumental in developing the Reinforcement Learning from Human Feedback (RLHF) pipelines that turned raw base models into the conversational ChatGPT we know today.
- Legacy: He is often cited as a bridge between "pure research" and "product-ready AI," ensuring that massive models actually follow user intent.
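At the heart of the RLHF pipelines mentioned above sits a reward model trained on human preference pairs. A minimal sketch of that pairwise (Bradley-Terry) objective, purely illustrative and not OpenAI's actual implementation:

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise reward-model loss: -log sigmoid(r_chosen - r_rejected).
    Training pushes the reward of the human-preferred response above
    that of the rejected one; a wider gap means a smaller loss."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# No preference gap gives log(2); a clear gap drives the loss toward zero.
print(preference_loss(0.0, 0.0))   # ~0.6931
print(preference_loss(3.0, 0.0))   # ~0.0486
```

The trained reward model then scores candidate responses during reinforcement learning, which is what steers a raw base model toward following user intent.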
Luke Metz
Luke Metz is a researcher known for his deep expertise in how AI models learn and optimize. He spent several years at Google Brain before joining OpenAI, where he was a key contributor to the development of GPT-4 and the infrastructure behind o1 (OpenAI’s reasoning models).
- Key Contributions: Much of his research has focused on Learned Optimizers—using machine learning to create better algorithms for training other machine learning models.
- Systems & Reasoning: At OpenAI, he focused heavily on the intersection of deep learning and reasoning, helping the models move beyond simple word prediction toward more complex problem-solving.
- Reputation: He is known in the community as a "researcher’s researcher," tackling the fundamental bottlenecks of how models compute and iterate.
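The learned-optimizer idea above replaces a hand-designed update rule like SGD with a parametric function whose parameters are themselves meta-trained across many tasks. A toy sketch, where the "learned" rule is a hand-picked momentum update standing in for a meta-trained network:

```python
def sgd_step(w: float, g: float, lr: float = 0.1) -> float:
    # Hand-designed rule: step against the raw gradient.
    return w - lr * g

def learned_step(w: float, g: float, state: float,
                 theta: tuple = (0.1, 0.5)) -> tuple:
    # Parametric rule: the update depends on the gradient and an internal
    # state, with parameters `theta` that a real learned optimizer would
    # meta-train over a distribution of training problems.
    lr, beta = theta
    state = beta * state + g
    return w - lr * state, state

# Both rules minimize f(w) = w**2 (gradient 2w) from the same start.
w_sgd, w_learned, state = 5.0, 5.0, 0.0
for _ in range(60):
    w_sgd = sgd_step(w_sgd, 2 * w_sgd)
    w_learned, state = learned_step(w_learned, 2 * w_learned, state)
```

In real learned-optimizer research the update rule is a small neural network, and the meta-training loop itself becomes the hard engineering problem.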
Sam Schoenholz
Sam Schoenholz is a physicist-turned-AI-researcher who brings a rigorous, mathematical approach to how neural networks function. Like Zoph and Metz, he was a long-time veteran of Google Brain before moving to OpenAI.
- Key Contributions: He is a co-author of the seminal work on $\mu$Transfer, built on the Maximal Update Parametrization ($\mu$P), alongside Greg Yang. This breakthrough allowed researchers to predict the behavior of massive models by training much smaller versions first, saving millions of dollars in compute costs.
- The Physics of Deep Learning: His work often explores the "geometry" of the loss landscape—essentially trying to understand the mathematical "terrain" an AI navigates as it learns.
- Scaling Laws: He has been a primary figure in refining the "Scaling Laws" that dictate how much more "intelligent" a model gets as you add more data and more chips.
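Such scaling laws are typically fit as a power law in parameter count N and token count D. A sketch using the widely cited Chinchilla-style form L(N, D) = E + A/N^alpha + B/D^beta; the constants below are illustrative values in the spirit of published fits, not a claim about any particular model:

```python
def scaling_loss(n_params: float, n_tokens: float,
                 E: float = 1.69, A: float = 406.4, B: float = 410.7,
                 alpha: float = 0.34, beta: float = 0.28) -> float:
    """Chinchilla-style loss fit: an irreducible term E plus power-law
    penalties for finite model size and finite data."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Doubling either parameters or data lowers the predicted loss, with
# diminishing returns governed by the exponents alpha and beta.
base = scaling_loss(1e9, 2e10)
more_params = scaling_loss(2e9, 2e10)
more_data = scaling_loss(1e9, 4e10)
print(base > more_params and base > more_data)  # True
```

The practical payoff is budget planning: given a fixed amount of compute, fits like this tell you how to split it between a bigger model and more training data.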



