AI agents, designed to act on our behalf online by automating tasks, are inadvertently exposing personal data that users expect to remain private. A new research project from Brave, titled SPILLAGE, highlights the significant privacy risks associated with these powerful tools.
As agents navigate live websites using user credentials and personal data, critical questions arise about how they handle sensitive information and whether user privacy expectations are met. The research probes whether privacy is an afterthought or a fundamental requirement for trustworthy agentic operations.
The Rise of the Digital Assistant
LLM-powered agents fulfill a long-held desire for digital assistants capable of handling daily tasks. Unlike simple chatbots, these agents can autonomously plan and execute sequences of actions, acting as a true extension of the user in the digital realm.
The web, with its constant stream of user interactions, is the natural environment for these agents. They promise to transform the web from a space of manual navigation into one of intelligent automation, handling everything from booking flights to comparing products.
Privacy Stakes Skyrocket
To perform tasks, web agents require access to a user's personal resources, including emails, calendars, and account credentials. This access creates a substantial privacy surface, as sensitive information is shared not only with the agent but potentially with every third-party website it interacts with.
As users delegate more online activity to agents, privacy risks compound. The agent becomes a central point of data exposure, aggregating and transmitting personal information at an unprecedented scale. The research argues that these risks demand urgent attention.
One striking example observed was Perplexity Comet copying user conversation histories directly into third-party search interfaces, disclosing sensitive personal information in the process. This kind of data disclosure by an LLM agent underscores the need for robust safeguards.
Natural Agentic Oversharing Unveiled
Web agents operate 'in the wild,' leaving observable traces of their actions. Every query, form submission, and click can inadvertently share more user information than necessary for task completion. Brave terms this phenomenon 'Natural Agentic Oversharing.'
This concept extends the idea of human oversharing to autonomous AI agents. The SPILLAGE framework categorizes this oversharing along two axes: directness (explicit vs. implicit) and channel (content vs. behavior).
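To make the two axes concrete, here is a minimal Python sketch of how observed disclosures might be classified under such a taxonomy. The enum names and example records are illustrative assumptions, not the SPILLAGE framework's actual API.

```python
from dataclasses import dataclass
from enum import Enum

class Directness(Enum):
    EXPLICIT = "explicit"   # information stated outright (e.g., typed into a field)
    IMPLICIT = "implicit"   # information inferable from context or patterns

class Channel(Enum):
    CONTENT = "content"     # what the agent writes: queries, form values, messages
    BEHAVIOR = "behavior"   # what the agent does: clicks, navigation, timing

@dataclass
class OversharingEvent:
    """One observed disclosure, classified along both axes (hypothetical schema)."""
    description: str
    directness: Directness
    channel: Channel

# Hypothetical examples of how disclosures like those in the article might be labeled.
events = [
    OversharingEvent("typed the user's full address into a site search box",
                     Directness.EXPLICIT, Channel.CONTENT),
    OversharingEvent("repeatedly filtered products by a medical category",
                     Directness.IMPLICIT, Channel.BEHAVIOR),
]

for e in events:
    print(f"{e.directness.value}/{e.channel.value}: {e.description}")
```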
For instance, an agent searching for glucose tests might inadvertently reveal that a user is divorced, whether through explicit text-field entries or through implicit behavioral patterns such as specific clicks or form choices observed over time. This illustrates how pervasive agentic oversharing can be.
Oversharing is Pervasive, Mitigation is Complex
Brave's research evaluated two open-source agentic frameworks across three LLMs on e-commerce sites. The findings were stark: oversharing is pervasive, with behavioral oversharing consistently exceeding content oversharing.
Crucially, instructing agents to be privacy-conscious via prompts proved insufficient. This suggests that simple prompt-level mitigation cannot address the deep-seated issues driving natural agentic oversharing.
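For context, prompt-level mitigation of the kind the study found insufficient typically amounts to prepending a privacy instruction to the agent's system prompt, as in the sketch below. The wording and helper function are illustrative assumptions, not the prompts used in the research.

```python
# Hypothetical privacy guardrail prepended to an agent's system prompt.
# The study's actual prompts may differ.
PRIVACY_GUARDRAIL = (
    "Share only the minimum personal information strictly required to "
    "complete the task. Never reveal the user's identity, history, or "
    "personal attributes unless the website explicitly requires them."
)

def build_system_prompt(task_instructions: str) -> str:
    """Combine the privacy guardrail with the task instructions."""
    return f"{PRIVACY_GUARDRAIL}\n\n{task_instructions}"

print(build_system_prompt(
    "Find the cheapest glucose test strips and add one pack to the cart."
))
```

Per the findings, guardrails of this sort do not reliably prevent oversharing, which motivates structural mitigations rather than prompt engineering alone.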
Privacy and Utility Are Allies, Not Enemies
A common assumption is that privacy and utility are in conflict. However, Brave's research challenges this notion. When task-irrelevant information was manually removed from an agent's input, task success actually improved by up to 17.9%.
This indicates that reducing oversharing does not hinder agent performance; it enhances it. Privacy and utility are therefore mutually reinforcing: building privacy-aware web agents is not a limitation but a performance improvement.
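As a rough illustration of what such data minimization could look like in practice, the sketch below filters a user profile down to task-relevant fields before it ever reaches the agent. The field names and per-task allowlist are assumptions for illustration, not the procedure used in the study.

```python
# Minimal sketch of input-side data minimization, assuming a dict-based
# user profile and a per-task allowlist of fields (both hypothetical).
TASK_RELEVANT_FIELDS = {
    "buy_glucose_test_strips": {"shipping_address", "payment_token"},
}

def minimize_profile(profile: dict, task: str) -> dict:
    """Keep only the profile fields the task actually needs."""
    allowed = TASK_RELEVANT_FIELDS.get(task, set())
    return {k: v for k, v in profile.items() if k in allowed}

profile = {
    "name": "Jane Doe",
    "marital_status": "divorced",      # task-irrelevant: should never reach the site
    "shipping_address": "123 Main St",
    "payment_token": "tok_abc123",
}

print(minimize_profile(profile, "buy_glucose_test_strips"))
# -> {'shipping_address': '123 Main St', 'payment_token': 'tok_abc123'}
```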
Brave is actively working to identify and mitigate these privacy risks in its own agents. Users can test an early version of agentic browsing in the Brave Leo AI assistant within Brave Nightly.
