Sally-Ann Delucia on AI Agent Context Management

Sally-Ann Delucia of Arize discusses the challenges and strategies for context management in AI agents, highlighting the importance of memory and sub-agents.

Sally-Ann Delucia presenting on Hierarchical Memory: Context Management in Agents
Image credit: StartupHub.ai / AI Engineer

Sally-Ann Delucia, Head of Product at Arize, recently shared insights into the complexities of context management for AI agents during an AI Engineer Europe event. Her presentation, "Hierarchical Memory: Context Management in Agents," highlighted the evolution from basic prompt engineering to more sophisticated strategies for handling the vast amounts of data and context that AI agents need to process.


Understanding the Problem: Context is the New Engineering Challenge

Delucia began by framing context management not just as a technical hurdle but as a product and user experience problem. She referenced Andrej Karpathy's assertion that "The stack is changing. Context is the new engineering problem," underscoring the growing importance of how AI models receive and utilize information.

The core issue, as Delucia explained, is that AI agents often struggle with the sheer volume of data, leading to a "vicious loop" where increasing context can cause failures. This problem is exacerbated because users rarely restart conversations, allowing the context to grow organically and potentially overwhelm the agent's capabilities.


Delucia illustrated this with a "naive truncation" approach, where only the first portion of the conversation or data is retained, and the rest is dropped. While simple, this method often results in the agent "forgetting everything," leading to fragmented and disconnected interactions. She emphasized that context engineering is about strategically choosing what the model sees, not just fitting within token limits.
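The naive truncation Delucia described can be sketched in a few lines: keep messages from the start of the conversation until a token budget runs out, and silently drop everything after. This is an illustrative sketch, not Arize's code; token counts are crudely approximated by whitespace splitting.

```python
def naive_truncate(messages, max_tokens=1000):
    """Keep only the first portion of the conversation that fits the
    token budget; everything after the cutoff is simply dropped."""
    kept, used = [], 0
    for msg in messages:
        cost = len(msg.split())  # crude whitespace token estimate
        if used + cost > max_tokens:
            break  # the rest of the conversation is lost to the agent
        kept.append(msg)
        used += cost
    return kept
```

The failure mode is easy to see: once the budget is hit, every later turn (often the most recent and most relevant ones) never reaches the model, producing the fragmented, "forgetting everything" behavior she warned about.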

Strategies for Effective Context Management

To escape this vicious loop, Delucia outlined three key strategies that Arize has been developing and implementing:

  • Control Context: This involves being deliberate about what information is fed to the AI model. It's not just about volume but about relevance and importance.
  • Separate Context from Memory: Delucia proposed a hierarchical approach where different types of context are managed separately. For instance, the main conversation might retain a lighter context, while heavier data and reasoning are offloaded to dedicated memory systems or sub-agents.
  • Move Heavy Work Out: This strategy involves creating specialized "sub-agents" that can handle computationally intensive tasks, such as complex data analysis or extensive search operations, without burdening the main agent. The results from these sub-agents are then fed back to the main agent, keeping the primary interaction streamlined.

Delucia showcased a diagram illustrating this shift from a "before" state, where the main conversation handled all chat history and heavy data, to an "after" state. In the "after" state, the main conversation is lighter, focusing on the core interaction, while a sub-agent handles the heavy data and complex reasoning.
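The before/after shift in that diagram might look like the following sketch: the sub-agent does the heavy scan over the full corpus and hands back only a compact result, so the main conversation's context stays light. All function names and the keyword-overlap scoring are illustrative assumptions, not the architecture Arize presented.

```python
def search_subagent(query, documents):
    """Heavy work, isolated from the main conversation: scan a large
    corpus and return only the single best-matching document."""
    scored = [
        (sum(word in doc.lower() for word in query.lower().split()), doc)
        for doc in documents
    ]
    best_score, best_doc = max(scored)
    return best_doc if best_score > 0 else None

def main_agent_turn(conversation, query, documents):
    """Light context: the chat history plus one compact tool result.
    The full corpus never enters the main agent's context window."""
    result = search_subagent(query, documents)  # offloaded heavy work
    conversation.append(f"tool: {result}")
    return conversation
```

The design point is the interface, not the scoring: whatever the sub-agent does internally, only its small return value is appended to the primary conversation.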

The Role of Sub-Agents and Long-Term Memory

The concept of sub-agents emerged as a critical solution for managing complex tasks like search operations. These specialized agents can process large datasets and perform intricate reasoning, returning only the necessary results to the main agent. This modular approach helps maintain the efficiency and responsiveness of the primary AI conversation.

However, Delucia also highlighted the ongoing challenge of "long memory." As conversations lengthen and users interact across different platforms or sessions, agents need to retain context over extended periods. She noted that while their current system uses a form of memory by storing data with IDs for retrieval, the ultimate goal is to achieve "real long-term memory" that allows agents to recall and utilize information from past interactions effectively.
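The ID-based memory she mentioned can be sketched as a simple store: heavy content is written once and only a short ID travels in the agent's context until the content is actually needed. The class and method names here are assumptions for illustration, not Arize's implementation.

```python
import uuid

class MemoryStore:
    """Keep heavy data out of the context window; pass IDs instead."""

    def __init__(self):
        self._items = {}

    def put(self, content):
        """Store heavy content and return a short ID that can sit
        cheaply in the agent's context."""
        mem_id = uuid.uuid4().hex[:8]
        self._items[mem_id] = content
        return mem_id

    def get(self, mem_id):
        """Recall the full content only when the agent needs it;
        returns None for unknown IDs."""
        return self._items.get(mem_id)
```

This gets retrieval, but not the "real long-term memory" Delucia described as the goal: the agent must still know it holds an ID worth dereferencing, rather than recalling past interactions on its own.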

Key Takeaways and Future Directions

Delucia concluded by summarizing the key lessons learned:

  • Context management is iterative and requires continuous learning and adaptation.
  • Context engineering is paramount; it dictates what the AI model perceives and acts upon.
  • Memory management is vital for maintaining coherent and useful long-term interactions.
  • Not all context needs to live inside a single agent; this realization has driven the development of more modular, sub-agent architectures.

Her final, powerful takeaway was:

"Agents don't fail because of prompts. They fail because of context."

This statement emphasizes the critical shift in focus from crafting perfect prompts to mastering the art and science of context management. Delucia encouraged the audience to consider how they can apply these principles to build more robust and capable AI agents.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.