In a recent presentation, Martin Keen, a Master Inventor at IBM, delves into the critical concept of "Agentic Storage" for artificial intelligence systems. Keen, a seasoned innovator with a deep understanding of complex systems, outlines the challenges and solutions for enabling AI agents to effectively interact with and leverage vast amounts of data stored across diverse systems.
Understanding Agentic AI and its Storage Needs
Keen begins by clarifying that agentic AI systems, powered by Large Language Models (LLMs), are not merely conversational chatbots. These agents are designed to perform actions, write code, and remediate incidents autonomously. However, a key limitation of current LLMs is their reliance on a finite 'context window' – essentially, their short-term memory. This means that without external data, their ability to perform complex, long-term tasks is severely restricted. This is where the need for robust and accessible storage solutions arises.
The Role of RAG and Vector Databases
To overcome the context window limitation, Keen highlights the importance of Retrieval Augmented Generation (RAG). RAG allows LLMs to access and incorporate external information into their responses. This is typically achieved by querying data stores, such as vector databases, which are optimized for semantic search. Keen illustrates this by drawing a diagram showing an LLM's context window interacting with a vector database to retrieve relevant information, which is then fed back into the LLM to generate a more informed response.
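The retrieval loop Keen diagrams can be sketched in a few lines. This is a toy illustration, not the system from the talk: the "vector database" is an in-memory dictionary with hand-written three-dimensional embeddings, where a real deployment would use a learned embedding model and an approximate-nearest-neighbor index.

```python
import math

# Toy in-memory "vector database": each document is stored alongside a
# pre-computed embedding vector. The vectors here are made up for
# illustration; real systems embed text with a trained model.
DOCS = {
    "doc1": ("Snapshots are created nightly.", [0.9, 0.1, 0.0]),
    "doc2": ("Object storage uses an S3-compatible API.", [0.1, 0.8, 0.2]),
    "doc3": ("NAS exports are mounted over NFS.", [0.0, 0.2, 0.9]),
}

def cosine(a, b):
    """Cosine similarity, the usual metric for semantic search."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_embedding, k=1):
    """Return the k documents most similar to the query embedding."""
    ranked = sorted(DOCS.values(),
                    key=lambda d: cosine(query_embedding, d[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, query_embedding):
    """The RAG step: prepend retrieved context before the question
    so the LLM answers from external data, not just its weights."""
    context = "\n".join(retrieve(query_embedding))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

The key idea is that `build_prompt` injects retrieved text into the context window, so the LLM's answer is grounded in data it was never trained on.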
The full discussion can be found on IBM's YouTube channel.
Bridging the Gap: Object, Block, and NAS Storage
The challenge, however, extends beyond simply having data available. AI agents need to interact with various types of storage, including object storage, block storage, and Network Attached Storage (NAS). Keen points out that each of these storage types has distinct APIs and data models, so a unified interface is required for seamless interaction. He proposes an MCP Server model, where an MCP (Model Context Protocol) server acts as a standardized intermediary that interfaces with the underlying storage systems.
The MCP Server Architecture
In this architecture, the MCP Host (where the AI agent runs) communicates with the MCP Server using a standardized protocol, JSON-RPC. The MCP Server then translates these requests into the specific protocols required by the underlying storage systems (object, block, NAS). This abstraction layer is crucial because it allows the agent to request 'resources' without needing to know the intricacies of each storage type. For instance, an agent might request to 'list directory,' 'read file,' or 'create snapshot,' and the MCP Server handles the translation to the appropriate API calls for the target storage system. This simplifies agent development and allows for greater flexibility in choosing storage backends.
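The translation layer might look something like the following. This is a minimal sketch of the dispatch pattern, not a real MCP implementation: the backend classes, method names, and return values are all illustrative stand-ins for actual storage API calls.

```python
# Hypothetical backends: each wraps one storage type behind the same
# generic operation names the agent uses.
class ObjectStoreBackend:
    def list_directory(self, path):
        # A real implementation would issue an S3-style LIST request.
        return [f"{path}/objects.json"]

class NasBackend:
    def list_directory(self, path):
        # A real implementation would walk an NFS or SMB mount.
        return [f"{path}/share"]

BACKENDS = {"object": ObjectStoreBackend(), "nas": NasBackend()}

def handle_request(request):
    """Route a JSON-RPC-style request to the right storage backend.

    The agent names a generic operation ('list_directory'); the server
    picks the backend and translates the call, so the agent never sees
    storage-specific APIs.
    """
    params = request["params"]
    backend = BACKENDS[params["backend"]]
    method = getattr(backend, request["method"])
    result = method(params["path"])
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}
```

Swapping in a new storage backend only requires adding an entry to `BACKENDS`; agent code is untouched, which is exactly the flexibility the abstraction layer buys.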
Implementing Safety Layers for Agentic AI
Beyond functionality, Keen emphasizes the critical need for safety layers to ensure the responsible and secure operation of agentic AI. He identifies three key safety layers:
- Immutable Versioning: This ensures that every action or data modification creates a new version, preventing accidental data loss and providing a complete audit trail. It means agents cannot truly delete data, only archive or version it, which is crucial for accountability and recovery.
- Sandboxing: Agents should operate within constrained environments, with access limited to only the specific directories and operations they require. This prevents an agent from accessing or modifying unrelated data, thereby mitigating potential security risks.
- Intent Validation: Before executing a potentially high-impact operation, the system should validate the agent's intent. This involves the agent explaining 'why' it needs to perform an action, allowing for a review and confirmation process to prevent unintended consequences or malicious behavior.
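The three safety layers above can be combined in one guard object. This is a minimal sketch under assumed semantics, not Keen's design: the class name, the prefix-based sandbox check, and the length-based intent check are all illustrative simplifications.

```python
import time

class SafeStore:
    """Toy guard combining immutable versioning, sandboxing, and
    intent validation. All names and rules here are illustrative."""

    def __init__(self, allowed_prefix):
        self.allowed_prefix = allowed_prefix  # sandbox boundary
        self.versions = {}  # path -> list of (timestamp, data) versions

    def _check_sandbox(self, path):
        # Sandboxing: the agent may only touch paths it was granted.
        if not path.startswith(self.allowed_prefix):
            raise PermissionError(f"{path} is outside the sandbox")

    def write(self, path, data):
        # Immutable versioning: every write appends a new version
        # instead of overwriting, preserving a full audit trail.
        self._check_sandbox(path)
        self.versions.setdefault(path, []).append((time.time(), data))

    def delete(self, path, intent):
        # Intent validation: a high-impact operation must carry a
        # stated reason, and it only archives (versions) the data.
        self._check_sandbox(path)
        if not intent or len(intent) < 10:
            raise ValueError("Deletion requires a stated intent for review")
        self.versions.setdefault(path, []).append((time.time(), None))
```

Note that `delete` never destroys anything: it records a tombstone version, so earlier versions remain recoverable, which is the accountability property the first layer demands.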
Keen argues that these safety measures are not merely desirable but essential for building trustworthy and reliable AI agents. By implementing these layers, organizations can harness the power of autonomous agents while mitigating the risks associated with their operations.
The Future of Agentic Storage
Keen's presentation underscores a significant shift in how we think about data storage in the age of AI. As agents become more sophisticated and capable of performing complex tasks, the underlying storage infrastructure must evolve to meet their needs. The concept of 'Agentic Storage,' facilitated by abstraction layers like the MCP Server and robust safety protocols, appears to be a vital step towards enabling the full potential of autonomous AI systems.