In true AI fashion, the future of autonomous businesses will likely see "at least one proof point of it actually working and many proof points of it not working quite well enough to actually roll out to production." This prescient observation from Gabe Goodhart captures the essence of a recent experiment by Anthropic, which provided a humorous yet sobering glimpse into the current reality of agentic AI. The experiment, dubbed Project Vend, became a central topic of discussion on the *Mixture of Experts* podcast, where host Tim Hwang spoke with IBM research experts Gabe Goodhart, Kush Varshney, and Marina Danilevsky.
The premise of Project Vend was simple yet ambitious: put an AI agent, a variant of Claude named "Claudius," in charge of an office vending machine. Given a budget of $1,000, Claudius was responsible for everything from managing inventory and setting prices to communicating with customers via Slack and even handling payments through Venmo. The goal was to see whether an AI could run a rudimentary business from end to end. The result was a comical failure. As Hwang summarized, "it turns out Claudius loses money."
