"The big thing that we are targeting is producing an automated researcher. So automating the discovery of new ideas." This bold declaration from OpenAI's Chief Scientist, Jakub Pachocki, encapsulates the ambitious trajectory of artificial intelligence, a vision profoundly shaped by the latest advancements in models like GPT-5.
In a recent a16z podcast, Pachocki and OpenAI Chief Research Officer Mark Chen sat down with general partners Anjney Midha and Sarah Wang. The discussion delved into GPT-5's launch, the evolution of AI evaluation, the surprising efficacy of reinforcement learning, and the research culture behind these breakthroughs. A central theme emerged: a strategic pivot from mere "vibe coding" to a more profound "vibe researching" paradigm, in which AI actively contributes to scientific discovery and economic impact.
GPT-5 represents a significant evolution, moving beyond previous iterations that primarily excelled as "instant response models." Mark Chen highlighted the core innovation: integrating long-horizon reasoning capabilities directly into the model's default behavior. "GPT-5 was really our attempt to bring reasoning into the mainstream," Chen explained, noting the prior distinction between fast-response GPT models and slower, more deliberative 'O-series' models. The goal is to eliminate the user’s burden of choosing the right mode, offering optimal reasoning by default. This fusion of speed and depth signals a maturation of AI, enabling it to tackle problems requiring sustained, complex thought processes rather than just rapid pattern matching.
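One way to picture that "reasoning by default" behavior is a dispatcher that decides, per request, how much deliberation to spend, so the user never picks a mode. The sketch below is purely illustrative: the heuristic, the helper names, and the token budgets are hypothetical stand-ins, not OpenAI's actual routing logic or API.

```python
# Illustrative sketch only: a toy dispatcher that hides the fast-vs-deliberative
# choice from the user. estimate_complexity() and call_model() are hypothetical
# helpers, not OpenAI's implementation.

def estimate_complexity(prompt: str) -> float:
    """Crude proxy (0.0-1.0) for how much deliberation a request needs."""
    hints = ("prove", "debug", "plan", "derive", "multi-step")
    score = sum(h in prompt.lower() for h in hints) / len(hints)
    return min(1.0, score + len(prompt) / 5000)

def call_model(prompt: str, reasoning_budget: int) -> str:
    """Hypothetical helper: sends the prompt with a budget for internal
    chain-of-thought tokens and returns the final answer."""
    return f"[answer produced with up to {reasoning_budget} reasoning tokens]"

def answer(prompt: str) -> str:
    # The user never chooses a mode; the system allocates deliberation itself.
    budget = 256 if estimate_complexity(prompt) < 0.3 else 8192
    return call_model(prompt, reasoning_budget=budget)

print(answer("What's the capital of France?"))              # quick reply
print(answer("Plan and derive a multi-step proof of ..."))  # deep reasoning
```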
As AI models approach human-level performance on traditional benchmarks, the methods of evaluation must also evolve. Jakub Pachocki acknowledged the saturation of existing "evals," where improvements from 96% to 98% no longer signify groundbreaking progress. OpenAI’s focus is shifting towards tangible, real-world impact. The next set of evaluations and milestones, Pachocki revealed, "will involve actual movement on things that are economically relevant." This isn't just about passing tests; it's about models demonstrating the capacity for genuine discovery, generating novel ideas with practical applications. This pursuit of "automated researchers" signifies a profound redefinition of AI's role, moving it from an assistant to a co-creator of knowledge.
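A rough back-of-the-envelope sketch (our own illustration, not from the podcast) shows why those last few points carry limited signal: near the ceiling of a benchmark with a few hundred items, sampling noise is roughly the same size as the reported gain.

```python
# Back-of-the-envelope: on a saturated benchmark of ~500 items, the 95%
# confidence interval around a measured accuracy is comparable to a
# 2-point improvement, so 96% -> 98% barely separates progress from noise.
import math

def confidence_halfwidth(accuracy: float, n_items: int, z: float = 1.96) -> float:
    """95% binomial confidence half-width for a measured accuracy."""
    return z * math.sqrt(accuracy * (1 - accuracy) / n_items)

n = 500  # assumed benchmark size for illustration
for acc in (0.96, 0.98):
    hw = confidence_halfwidth(acc, n)
    print(f"accuracy {acc:.0%} on {n} items -> ±{hw:.1%} at 95% confidence")
# Prints roughly ±1.7% and ±1.2%: error bars on the order of the gain itself.
```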
Reinforcement Learning (RL) continues to defy skeptics, proving to be an unexpectedly powerful tool on this new frontier. Chen and Pachocki emphasized RL's success in domains like mathematics and programming competitions. Through RL, models can be trained into domain experts, capable of reasoning deeply and consistently without "going off track." This ability to "reason very hard" about a problem allows AI to excel at complex, verifiable tasks, pushing the boundaries of what was previously thought possible. Indeed, the models now outperform human experts in certain competitive programming challenges, illustrating a rapid ascent in problem-solving prowess.
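The common thread in those domains is a verifiable reward: a proof checker, a judge, or a test suite that can objectively score each attempt. The toy below is a deliberately simplified stand-in for that recipe, not OpenAI's training stack; the "policy" is just a weight table over candidate answers, and attempts the verifier accepts get reinforced.

```python
# Toy illustration of RL with a verifiable reward: sample an answer, check it
# objectively, and upweight whatever the verifier accepts. Real systems apply
# policy-gradient updates to an LLM instead of a weight table.
import random

problem = "17 * 24"
candidates = [398, 408, 418, 428]
weights = {c: 1.0 for c in candidates}   # uniform "policy" to start

def sample_answer() -> int:
    """Draw a candidate answer in proportion to current policy weights."""
    total = sum(weights.values())
    r = random.uniform(0, total)
    for c, w in weights.items():
        r -= w
        if r <= 0:
            return c
    return candidates[-1]

def verify(answer: int) -> bool:
    """Objective, automatically checkable reward signal."""
    return answer == eval(problem)       # 408

for _ in range(300):                     # reinforce answers the verifier accepts
    a = sample_answer()
    if verify(a):
        weights[a] *= 1.05               # upweight rewarded behavior

print(max(weights, key=weights.get))     # the policy comes to prefer 408
```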
The impact of these advancements is already transforming fields like software development. The concept of "vibe coding," where developers intuitively interact with AI to generate code, is becoming the default. This marks a shift from merely automating repetitive tasks to a collaborative process where AI accelerates and enhances human creativity. The future, as Chen mused, "hopefully will be vibe researching," extending this symbiotic relationship into the realm of fundamental scientific inquiry. This collaborative evolution underscores AI's potential not just to augment, but to fundamentally alter how knowledge is created and applied across industries.
OpenAI’s success in navigating these complex research challenges is deeply intertwined with its unique organizational culture. The company prioritizes attracting "cave-dweller" talent—individuals deeply focused on fundamental research, unswayed by fleeting trends. Leadership strives to protect this core research function within a product-driven company, ensuring researchers have the space and resources to pursue ambitious, long-term goals without constant pressure for immediate productization. This delicate balance, coupled with a commitment to transparency and a clear, shared vision, fosters an environment where groundbreaking discoveries are not just possible, but expected. The emphasis on clear hypotheses, rigorous self-assessment, and learning from inevitable failures forms the bedrock of their collective progress.

