A recent episode of the "Mixture of Experts" podcast, featuring Tim Hwang, Aaron Baughman, Chris Hay, and Lauren McHugh Olende, underscores a critical inflection point in AI development: the arrival of Nano Banana and of the 100-page prompt. Prompts are often dismissed as trivial inputs, yet KPMG's use of such an extensive one for its agentic TaxBot, designed to generate 25-page advisory opinions, highlights the escalating complexity and specificity required to align large language models with intricate business processes. This meticulous crafting of instructions shows that prompt engineering is far from a dying art; it is evolving into a sophisticated discipline.
Aaron Baughman, an IBM Fellow, shared his own experience with a 40-page prompt, emphasizing its effectiveness in summarizing complex manuals. Chris Hay, a Distinguished Engineer, concurred, stating that he is "not surprised" by such lengthy prompts. He argues that if a model lacks domain knowledge, embedding that information directly into the context window is a pragmatic solution; he prefers it to "roll[ing] the dice at RAG." Lauren McHugh Olende, however, offered a sharper critique: "If a product requires a 100-page user manual to work, at best it's poorly designed, at worst it's broken." This succinctly captures the tension between brute-force contextualization and the pursuit of more elegant, perhaps fine-tuned, model architectures.
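To make Chris Hay's approach concrete, here is a minimal sketch of what "put the whole manual in the context window" looks like in practice. The manual text, the question, and the commented-out model call are placeholders invented for illustration, not details from the episode or from KPMG's system.

```python
# Illustrative sketch only: embed an entire domain document in the prompt
# instead of retrieving passages at query time (RAG). The manual, question,
# and model call below are placeholders, not details from the episode.

MANUAL = """Section 1: Cross-border VAT treatment ...
Section 2: Transfer pricing documentation ...
(in practice this could run to 100 pages of policy text)"""

def build_stuffed_prompt(manual: str, question: str) -> str:
    """Place the full manual in the prompt so the model needs no retrieval step."""
    return (
        "You are a tax advisory assistant. Use ONLY the manual below.\n\n"
        f"=== DOMAIN MANUAL ===\n{manual}\n=== END MANUAL ===\n\n"
        f"Question: {question}\n"
        "Cite the relevant manual sections in your answer."
    )

prompt = build_stuffed_prompt(MANUAL, "How is cross-border VAT handled?")
print(f"Prompt length: {len(prompt):,} characters")
# response = llm_client.generate(prompt)  # swap in whichever client and model you use
```

The trade-off Lauren McHugh Olende points to is visible even in this toy version: everything the model needs is in one place, but every query pays the token cost of the full manual, and the prompt itself becomes an artifact that has to be maintained.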
Another significant topic was OpenAI's potential pivot into selling compute infrastructure, a strategic move hinted at by its CFO. The shift, akin to Amazon's early move to monetize its excess e-commerce infrastructure with AWS, suggests a recognition of the rapidly changing economics of AI. Lauren McHugh Olende insightfully pointed out the likely emergence of a "second-hand GPUs" market as OpenAI continually upgrades its hardware for cutting-edge research and commercial offerings. Chris Hay echoed this, noting that if the newest GPUs are the most economical to run for training, older generations can still be rented out profitably for inference workloads. This points to a maturing ecosystem in which hardware that is "obsolete" for frontier-model training retains significant value, driven by the insatiable demand for AI compute. Aaron Baughman viewed the move as a strategic play to turn OpenAI's dependency on Azure into a more balanced "collaboration," while acknowledging the "trillion dollar investment" required.
The discussion then turned to Google Gemini's Nano Banana image generation model, which garnered enthusiastic praise from Chris Hay, who declared it "by far the best image generation model that I've seen today." He demonstrated its remarkable ability to perform multi-turn editing, changing expressions and backgrounds and even adding new figures while maintaining physical consistency. This capability extends beyond mere image generation into sophisticated editing and contextual understanding. Lauren McHugh Olende highlighted the importance of this "editing-focused model" for non-malicious applications, such as generating synthetic data to overcome challenges like cloud cover in geospatial imagery for NASA.
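For readers who want to try this themselves, a rough sketch of the multi-turn editing pattern, in the spirit of Chris Hay's demo, is shown below using Google's google-genai Python SDK. The model identifier and the response-parsing details are assumptions based on the SDK's documented image-generation flow, so treat this as a starting point and check the current Gemini documentation.

```python
# Rough sketch of multi-turn image editing with the google-genai SDK.
# The model id and response parsing are assumptions; verify against current docs.
from io import BytesIO

from google import genai
from PIL import Image

client = genai.Client()  # picks up the API key from the environment
MODEL = "gemini-2.5-flash-image-preview"  # assumed id for the "Nano Banana" model

def first_image(response) -> Image.Image:
    """Return the first inline image found in a generate_content response."""
    for part in response.candidates[0].content.parts:
        if part.inline_data is not None:
            return Image.open(BytesIO(part.inline_data.data))
    raise ValueError("no image in response")

# Turn 1: generate a base image.
resp = client.models.generate_content(
    model=MODEL,
    contents=["A tennis player mid-serve on a hard court, photorealistic"],
)
base = first_image(resp)

# Turn 2: edit the previous output while asking the model to keep the scene consistent.
resp = client.models.generate_content(
    model=MODEL,
    contents=[base, "Same player and court, but make it a night match under floodlights"],
)
first_image(resp).save("edited.png")
```

Feeding the previous output back in as an input to the next request is what makes the editing "multi-turn": each instruction operates on the image the model just produced rather than starting from scratch.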
Finally, the podcast showcased IBM's AI experimentation at the US Open. Aaron Baughman unveiled three fan-centric features powered by generative AI models built with IBM watsonx: Match Chat, Key Points, and Live Likelihood to Win. Match Chat provides real-time answers to fan queries, Key Points offers concise summaries of match articles, and Live Likelihood to Win uses predictive modeling to display dynamic win probabilities throughout a match. These applications exemplify how AI is being deployed to deepen fans' engagement with and understanding of complex real-time events.
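The episode did not go into the internals of Live Likelihood to Win, but a toy example helps make the idea of a dynamic win probability concrete: start from a pre-match estimate and nudge it as the live score changes. The logistic formula, the ratings, and the set-weight constant below are all invented for illustration and are not IBM's actual model.

```python
# Toy illustration only, NOT IBM's model: a hypothetical live win probability
# that starts from pre-match player ratings and shifts with the live set score.
import math

def pre_match_prob(rating_a: float, rating_b: float) -> float:
    """Elo-style logistic baseline from the two players' ratings."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def live_prob(rating_a: float, rating_b: float, sets_a: int, sets_b: int,
              set_weight: float = 0.8) -> float:
    """Blend the baseline with the current set score (weights are made up)."""
    baseline = pre_match_prob(rating_a, rating_b)
    # Convert the baseline to log-odds, add a bonus per set won, convert back.
    log_odds = math.log(baseline / (1.0 - baseline)) + set_weight * (sets_a - sets_b)
    return 1.0 / (1.0 + math.exp(-log_odds))

print(f"Pre-match:  {pre_match_prob(2100, 2000):.1%}")   # roughly 64%
print(f"Up a set:   {live_prob(2100, 2000, 1, 0):.1%}")  # probability rises
print(f"Down a set: {live_prob(2100, 2000, 0, 1):.1%}")  # probability falls
```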
The rapid evolution of AI, from immense prompts to infrastructure shifts and increasingly sophisticated multimodal generation with Nano Banana, continues to redefine the boundaries of what is possible, often at a breathtaking pace.

