The world of sports, an arena traditionally driven by raw human performance and unpredictable drama, is now being profoundly reshaped by the intricate dance of agentic artificial intelligence and large language models. IBM's pioneering work, unveiled by IBM Fellow and Master Inventor Aaron Baughman, showcases a real-time interactive AI system, powered by an agentic graph, designed to deliver unparalleled insights to tennis fans at the Wimbledon Championships and the US Open. This sophisticated architecture doesn't merely present data; it crafts dynamic narratives, offering a glimpse into the future of intelligent fan experiences.
Baughman illustrates how this groundbreaking assistant empowers fans to ask live questions during singles matches, receiving "instant and insightful answers, right at your fingertips." The user experience is meticulously designed, beginning with a choice between live match tracking or a recap. Upon entering the "match chat," users are gently guided with pre-curated questions, a technique Baughman terms "classic UX priming," intended to "lower the barrier for engagement, spark your curiosity, and invite participation throughout a match." More inquisitive minds can leverage an open field to pose any query, ensuring no curiosity remains unanswered. This intuitive interface, seamlessly mirrored across both mobile and desktop, provides a consistent and device-agnostic interaction model, whether one is courtside or at home.
At the heart of this innovation lies a robust event-driven architecture, built on a publish-subscribe messaging system. As a match progresses, the system ingests a continuous stream of scoring and performance data, immediately publishing it to on-demand topics for near real-time availability. Simultaneously, it writes dozens of JSON files to cloud object storage buckets, fronted by Content Delivery Networks (CDNs) to ensure high-speed global distribution and caching.
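The publish-subscribe flow described above can be sketched in a few lines. This is a minimal in-memory illustration, not IBM's implementation: the `Broker` class, the topic name `match/1234/score`, the event fields, and the `object_store` dictionary standing in for a cloud bucket are all assumptions made for the example.

```python
import json
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class Broker:
    """Tiny in-memory publish-subscribe broker (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every event on a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> int:
        """Deliver an event to all subscribers; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


# Hypothetical stand-in for a cloud object storage bucket behind a CDN.
object_store: Dict[str, str] = {}


def store_snapshot(event: dict) -> None:
    """Serialize the latest event to JSON under a per-match key."""
    key = f"matches/{event['match_id']}/latest.json"
    object_store[key] = json.dumps(event, sort_keys=True)


received: List[dict] = []
broker = Broker()
broker.subscribe("match/1234/score", received.append)  # live consumer
broker.subscribe("match/1234/score", store_snapshot)   # JSON writer

delivered = broker.publish(
    "match/1234/score",
    {"match_id": "1234", "point_winner": "A", "score": "15-0"},
)
```

The point of the pattern is that the scoring feed does not need to know who consumes it: a live consumer and a JSON writer subscribe to the same topic independently, which mirrors the article's description of data being published to topics while files are written to storage at the same time.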
Once a user submits a query, it traverses secure firewalls and CDNs, landing in a containerized middleware application with 30 active replicas across multiple cloud regions. This middleware app first analyzes and interprets the query using a MiniLM sentence-embedding model (the L6-v2 variant) to generate numerical embeddings. These embeddings are then passed through a random forest of 100 decision trees, which classifies the query into specific tennis categories such as player stats, match logistics, or live insights.
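To make the ensemble step concrete, the sketch below shows one way the votes of 100 trees could be reduced to a category and a confidence score. The trees themselves are not modeled; `votes` is a made-up list of per-tree predictions, and simple majority voting is an assumption (libraries such as scikit-learn average per-tree class probabilities instead).

```python
from collections import Counter
from typing import Dict, List, Tuple


def aggregate_votes(tree_votes: List[str]) -> Tuple[str, Dict[str, float]]:
    """Return the majority category and each category's share of the votes."""
    counts = Counter(tree_votes)
    total = len(tree_votes)
    shares = {label: n / total for label, n in counts.items()}
    winner = counts.most_common(1)[0][0]
    return winner, shares


# Hypothetical predictions from 100 trees for a single embedded question.
votes = (
    ["player_stats"] * 72
    + ["live_insights"] * 20
    + ["match_logistics"] * 8
)

category, shares = aggregate_votes(votes)
confidence = shares[category]  # share of trees that agree with the winner
```

The winning category's vote share doubles as a rough confidence score, which is the kind of signal a downstream router can threshold.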
Crucially, the system incorporates a "Hateful, Abusive, Profanity" (HAP) filter to screen all questions, ensuring conversations remain safe and respectful. This small touch adds transparency to the AI's cognitive process and maintains user engagement.
After classification and moderation, a decision point determines the next step. If the query confidently fits a known category, it proceeds to a custom extension layer, an application running on over 60 replicas across a multi-region Kubernetes platform. If confidence is low, or ambiguity is flagged, the query is routed to a knowledge-based system, which acts as a fallback and provides pre-trained responses for broader, less specific inquiries. The custom extension layer uses various tools to extract information relevant to the classified question and formats the data as either raw JSON (preserving the original schema) or LLM JSON (annotated with descriptive text to help the LLM interpret it). Finally, a generative agent formulates the answer. Should the agents struggle to respond confidently because of insufficient data or lag in the real-time play feed, a lightweight LLM prompt acts as a final synthesizer, attempting to assemble an answer from whatever fragments of information the pipeline has gathered. According to Baughman, this process ensures the architecture "balances scale, speed, safety, as well as accuracy."
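The decision point can be pictured as a small routing function. The following is a sketch under stated assumptions: the 0.6 confidence threshold, the word-list stand-in for the HAP filter, and the destination names are all hypothetical, since the article does not give the actual values or models.

```python
from dataclasses import dataclass
from typing import Dict, Optional

CONFIDENCE_THRESHOLD = 0.6       # assumed cutoff, not from the source
BLOCKLIST = {"idiot", "stupid"}  # toy stand-in for a real HAP model


@dataclass
class Route:
    destination: str            # "rejected", "custom_extension", "knowledge_base"
    category: Optional[str]     # set only when a known category is chosen


def is_hap(question: str) -> bool:
    """Flag a question if any word appears in the placeholder blocklist."""
    words = {w.strip(".,!?").lower() for w in question.split()}
    return bool(words & BLOCKLIST)


def route(question: str, class_probs: Dict[str, float]) -> Route:
    """Moderate first, then route on classifier confidence."""
    if is_hap(question):
        return Route("rejected", None)
    category, confidence = max(class_probs.items(), key=lambda kv: kv[1])
    if confidence >= CONFIDENCE_THRESHOLD:
        return Route("custom_extension", category)
    # Low confidence: fall back to the broader knowledge-based system.
    return Route("knowledge_base", None)


confident = route(
    "How many aces has she served?",
    {"player_stats": 0.82, "match_logistics": 0.10, "live_insights": 0.08},
)
ambiguous = route(
    "Tell me something fun",
    {"player_stats": 0.40, "match_logistics": 0.35, "live_insights": 0.25},
)
blocked = route("That umpire is an idiot!", {"player_stats": 0.9})
```

Running moderation before the confidence check means an abusive question is never forwarded, no matter how confidently it was classified.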
Beyond static data, the system provides dynamic, real-time insights through its "Likelihood to Win" (L2W) estimates. Prior to a match, a pre-match L2W model, based on head-to-head history and other predictors, provides initial winning probabilities. As the match unfolds, the system transitions to a live L2W model, continuously updating probabilities after every single point. This model uses a probabilistic equation, where the odds of a player winning are constantly re-evaluated given the live evidence. A "decayed pre-match probability" gradually diminishes the influence of initial predictions as real-time match data becomes more significant. Furthermore, a booster function activates during critical match events, ensuring that pivotal moments are accurately reflected in the L2W estimates.
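The article does not give the actual L2W equation, so the following is only one plausible shape for it, written as a sketch. It assumes the pre-match probability is blended with live evidence in log-odds space, that the prior's weight decays exponentially with points played, and that the booster multiplies the weight of live evidence on critical points. The function name, `decay_rate`, `boost`, and the use of smoothed point share as evidence are all illustrative choices.

```python
import math


def logit(p: float) -> float:
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def live_l2w(
    pre_match_prob: float,
    points_won: int,
    points_lost: int,
    decay_rate: float = 0.02,  # assumed: how fast the prior fades per point
    critical: bool = False,    # assumed flag for e.g. a break or set point
    boost: float = 1.5,        # assumed booster applied on critical points
) -> float:
    """Blend a decaying pre-match prior with live point evidence."""
    points_played = points_won + points_lost
    prior_weight = math.exp(-decay_rate * points_played)
    # Laplace-smoothed share of points won, so 0 points gives exactly 0.5.
    live_share = (points_won + 1) / (points_played + 2)
    evidence_weight = (1.0 - prior_weight) * (boost if critical else 1.0)
    log_odds = (
        prior_weight * logit(pre_match_prob)
        + evidence_weight * logit(live_share)
    )
    return sigmoid(log_odds)


at_start = live_l2w(0.6, points_won=0, points_lost=0)
ahead = live_l2w(0.6, points_won=40, points_lost=20)
ahead_critical = live_l2w(0.6, points_won=40, points_lost=20, critical=True)
behind = live_l2w(0.6, points_won=20, points_lost=40)
```

Before any point is played the estimate equals the pre-match probability. As points accumulate, the prior's weight shrinks and the live evidence dominates, so a pre-match favorite who is losing most points drops below even odds.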
This live L2W visualization serves as both a statistical output and a narrative, capturing the ebb and flow of match momentum and identifying crucial turning points. The underlying messaging infrastructure, which uses MQTT, publishes data to the relevant topics as each scoring event occurs, enabling a fast, highly parallelized recalculation of L2W at that exact moment of play. The resulting values are serialized and stored in CDNs and cloud object storage, allowing asynchronous access for fans worldwide. By blending agentic AI, streaming data, generative models, and predictive modeling with a smart user experience, IBM transforms raw match data into clear, engaging narratives for tennis fans globally.

