Tim Hwang, host of IBM's "Mixture of Experts" podcast, recently convened a panel of IBM Senior Research Scientists Marina Danilevsky and Nathalie Baracaldo, alongside AI Research Engineer Sandi Besen, to dissect critical developments in artificial intelligence. Their discussion spanned the sobering reality of generative AI pilots, the revelation of a hidden prompt within GPT-5, and inherent flaws in large reasoning models. The conversation painted a picture of an industry grappling with misaligned expectations and the profound implications of AI's increasing autonomy.
A recent MIT NANDA Initiative report casts a shadow over the enterprise AI landscape, finding that 95% of generative AI pilots are falling short of expectations. This figure, as host Tim Hwang noted, indicates that initial deployments are "not really anywhere near the expectations of the people implementing them." Sandi Besen acknowledged that such a number grabs headlines but wanted more context on the study's methodology, particularly how ROI was measured and who was surveyed. She found the 95% figure "too high for what I think the capabilities of this technology is," suggesting a disconnect between what the technology can do and the results the pilots are reporting.
