AI Agents Running Businesses: Andon Labs on Project Vend

Andon Labs' Lukas Petersson and Axel Backlund discuss Project Vend, an experiment using AI agents to run a simulated vending business, exploring LLM capabilities and challenges.

Jun 4 at 9:03 PM

In a recent discussion on the potential for AI agents to run businesses, Lukas Petersson and Axel Backlund of Andon Labs offered insights into their work with Project Vend. The project aimed to test the capabilities of large language models (LLMs) in managing a simulated vending machine business, revealing both the promise and the current limitations of AI in complex, real-world tasks.

Source: AI Agents Running Businesses: Andon Labs on Project Vend, from Latent Space

The Genesis of Project Vend

Petersson and Backlund explained that their research was driven by a desire to understand how AI agents operate without human oversight. They saw the vending machine business as a suitable testbed, one that let them benchmark AI capabilities in a controlled yet realistic environment. The project simulated various aspects of running a vending business, from managing inventory and pricing to handling customer interactions and financial transactions.

Claudius: The AI Agent at the Helm

The core of Project Vend was an AI agent named Claudius, which was tasked with managing the vending machine business. Claudius was given a set of tools, including web search and email capabilities, to interact with the simulated environment. The agent was prompted with specific objectives, such as maximizing profits and maintaining a positive bank balance. The experiment aimed to assess how well Claudius could adapt to challenges, learn from its mistakes, and ultimately achieve its business goals.
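As a rough illustration of this kind of setup, the Python sketch below runs an agent loop that calls tools against a toy vending simulation. Everything in it is an assumption made for illustration: the `VendingSim` class, the `restock` and `set_price` tools, the linear demand curve, and the `scripted_policy` function standing in for an LLM. None of it is taken from Andon Labs' actual environment or from Claudius's real tool set.

```python
from dataclasses import dataclass


@dataclass
class VendingSim:
    """Toy simulated vending business (hypothetical, not Andon Labs' environment)."""
    balance: float = 100.0
    unit_cost: float = 1.0
    price: float = 2.0
    stock: int = 0
    daily_fee: float = 2.0

    def restock(self, units: int) -> str:
        """Tool: buy units at wholesale cost if the balance allows it."""
        cost = units * self.unit_cost
        if units <= 0 or cost > self.balance:
            return "restock rejected"
        self.balance -= cost
        self.stock += units
        return f"restocked {units}"

    def set_price(self, price: float) -> str:
        """Tool: change the retail price."""
        if price <= 0:
            return "price rejected"
        self.price = price
        return f"price set to {price:.2f}"

    def end_day(self) -> int:
        """Advance one day: sell what demand allows, then charge the daily fee."""
        demand = max(0, int(20 - 5 * self.price))  # assumed linear demand curve
        sold = min(demand, self.stock)
        self.stock -= sold
        self.balance += sold * self.price - self.daily_fee
        return sold


def scripted_policy(sim: VendingSim):
    """Stand-in for the LLM: returns a (tool_name, argument) pair."""
    if sim.stock < 10:
        return ("restock", 10)
    return ("set_price", 2.0)


def run(days: int) -> VendingSim:
    """Run the agent loop: one tool call per day, then close out the day."""
    sim = VendingSim()
    tools = {"restock": sim.restock, "set_price": sim.set_price}
    for _ in range(days):
        name, arg = scripted_policy(sim)
        tools[name](arg)
        sim.end_day()
    return sim
```

With these assumed numbers, `run(5)` ends with a balance of 140.0 and no leftover stock. In a real experiment the scripted policy would be replaced by a model choosing which tool to call, which is where the behaviors discussed in the article would show up.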

Key Findings and Challenges

The Andon Labs team shared several key findings from their experiments. One of the most impactful changes they implemented was to refine Claudius's ability to follow procedures. Initially, Claudius struggled with tasks like stocking items and managing inventory, often making basic errors. However, by providing more explicit instructions and implementing better feedback mechanisms, they observed improvements in Claudius's performance. The agent also demonstrated a capacity for creative problem-solving, at times devising novel solutions to unexpected issues.

The project also highlighted several challenges. The limited context windows of LLMs meant that Claudius sometimes struggled to maintain long-term coherence in its actions, leading to repetitive or nonsensical behaviors. Furthermore, the agent tended to over-optimize for certain metrics, such as minimizing transactions at a loss, which could sometimes lead to suboptimal business outcomes. The researchers also noted that while LLMs can be very effective at clearly defined tasks, they often struggle with ambiguity and require careful prompt engineering to ensure the desired behavior.

The Role of Data and Benchmarking

Petersson and Backlund emphasized the importance of data in training and evaluating AI agents. They explained that Project Vend generated a significant amount of data, which they used to benchmark different LLMs and identify areas for improvement. The team also discussed the need for more robust evaluation methods that can capture the nuances of real-world business scenarios. They believe that future research should focus on developing benchmarks that can better assess the adaptability and resilience of AI agents in dynamic environments.
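To make the benchmarking idea concrete, here is a minimal Python sketch that scores competing agents on the same seeded simulations and reports both the mean and the worst-case final balance. All names and numbers (`simulate`, `benchmark`, the `steady` and `idle` policies, the demand noise, the daily fee) are hypothetical. This is not the evaluation method Andon Labs uses, only one simple way such a comparison could be structured.

```python
import random
from statistics import mean


def simulate(policy, seed: int, days: int = 30) -> float:
    """Run one seeded episode of a toy vending business; return the final balance."""
    rng = random.Random(seed)
    balance, stock = 100.0, 0
    for _ in range(days):
        price, order = policy(balance, stock)
        order = max(0, min(order, int(balance)))  # unit cost assumed to be 1.0
        balance -= order
        stock += order
        # Assumed demand: linear in price plus seeded noise.
        demand = max(0, int(20 - 5 * price) + rng.randint(-3, 3))
        sold = min(demand, stock)
        stock -= sold
        balance += sold * price - 2.0  # assumed daily fee of 2.0
    return balance


def benchmark(policies, seeds):
    """Score every policy on the same seeds so the results are comparable."""
    results = {}
    for name, policy in policies.items():
        finals = [simulate(policy, s) for s in seeds]
        results[name] = {"mean": mean(finals), "worst": min(finals)}
    return results


def steady(balance, stock):
    """Hypothetical agent: fixed price, top stock back up to 10 units."""
    return 2.0, max(0, 10 - stock)


def idle(balance, stock):
    """Hypothetical baseline: never orders anything."""
    return 2.0, 0


scores = benchmark({"steady": steady, "idle": idle}, seeds=range(5))
```

Reporting the worst case alongside the mean is one way to capture resilience as well as average performance: an agent that does well on average but occasionally collapses would show a low `worst` score.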

The conversation also touched upon the ethical implications of deploying AI agents in business operations. While the potential benefits are significant, the researchers acknowledged the need for careful consideration of issues such as job displacement and the potential for AI to exacerbate existing inequalities. They stressed the importance of developing AI systems that are not only effective but also aligned with human values and societal goals.

© 2026 StartupHub.ai. All rights reserved.