ClawGuard Secures LLM Agents

Deterministic Tool-Call Boundary Enforcement

ClawGuard replaces unreliable, alignment-dependent defenses for LLM agents with a deterministic, auditable check. It enforces a user-confirmed rule set at every tool-call boundary, acting as a gatekeeper that intercepts adversarial tool calls before they can produce real-world effects. Only calls that the rule set permits are executed, which turns the tool-call interface from a key vulnerability into a controlled checkpoint.
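To make the idea of a tool-call boundary concrete, here is a minimal sketch of a gatekeeper that checks each call against a rule set before dispatching it and records every decision. It is an illustration only, not ClawGuard's implementation: the names Rule, Guard, and BlockedToolCall, the glob-pattern rule format, and the example tools are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Any, Callable


@dataclass
class Rule:
    """Hypothetical rule: allow one tool if its constrained arguments match glob patterns."""
    tool: str
    arg_patterns: dict[str, str] = field(default_factory=dict)

    def permits(self, tool: str, args: dict[str, Any]) -> bool:
        if tool != self.tool:
            return False
        # Every constrained argument must be present and match its pattern.
        return all(
            name in args and fnmatchcase(str(args[name]), pattern)
            for name, pattern in self.arg_patterns.items()
        )


class BlockedToolCall(Exception):
    """Raised when a tool call is not permitted by any rule."""


class Guard:
    """Sits between the agent and its tools; deny by default."""

    def __init__(self, rules: list[Rule], tools: dict[str, Callable[..., Any]]):
        self.rules = list(rules)
        self.tools = dict(tools)
        self.audit_log: list[tuple[str, dict[str, Any], str]] = []

    def call(self, tool: str, args: dict[str, Any]) -> Any:
        allowed = tool in self.tools and any(
            rule.permits(tool, args) for rule in self.rules
        )
        # Record the decision before anything executes, so the log is complete.
        self.audit_log.append((tool, dict(args), "allow" if allowed else "deny"))
        if not allowed:
            raise BlockedToolCall(f"{tool} denied by rule set")
        return self.tools[tool](**args)


if __name__ == "__main__":
    guard = Guard(
        rules=[Rule("fetch", {"url": "https://docs.example.com/*"})],
        tools={
            "fetch": lambda url: f"fetched {url}",
            "send_email": lambda to, body: f"sent to {to}",
        },
    )
    print(guard.call("fetch", {"url": "https://docs.example.com/guide"}))
    try:
        # A call an injected instruction might trigger; no rule covers it.
        guard.call("send_email", {"to": "attacker@example.org", "body": "secrets"})
    except BlockedToolCall as err:
        print("blocked:", err)
```

The decision in this sketch depends only on the rule set and the call's name and arguments, never on model output, which is what makes it deterministic and replayable from the audit log.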

Automated Task-Specific Constraint Derivation

A core innovation of ClawGuard is that it automatically derives task-specific access constraints from the user's stated objective. The derivation runs before any external tool is invoked, so precise boundaries for agent actions are in place from the first tool call. Because the constraints reflect the user's intent, ClawGuard can block all three identified injection pathways (web and local content injection, MCP server injection, and skill file injection) without any modification to the underlying LLM or its infrastructure.
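The text above does not say how the constraints are derived, so the following is only a toy stand-in for that step: it pulls the URLs and file paths a user names in the task description and treats them as an allowlist, which would then be shown to the user for confirmation. The regular expressions, the Constraints type, and the function names are all assumptions made for this illustration.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical patterns: URLs, and paths that start with "./" or "/".
URL_RE = re.compile(r"https?://[^\s,;)]+")
PATH_RE = re.compile(r"(?<!\S)(?:\./|/)[\w./-]+")


@dataclass(frozen=True)
class Constraints:
    allowed_hosts: frozenset[str]
    allowed_paths: frozenset[str]


def derive_constraints(task: str) -> Constraints:
    """Derive an allowlist from the task text alone, before any tool runs."""
    urls = [u.rstrip(".,") for u in URL_RE.findall(task)]
    hosts = {urlparse(u).hostname for u in urls}
    hosts.discard(None)
    # Remove URLs first so their path components are not mistaken for local paths.
    remainder = URL_RE.sub(" ", task)
    paths = {p.rstrip(".,") for p in PATH_RE.findall(remainder)}
    return Constraints(frozenset(hosts), frozenset(paths))


def url_permitted(constraints: Constraints, url: str) -> bool:
    """A URL is permitted only if its host was named in the user's task."""
    return urlparse(url).hostname in constraints.allowed_hosts


if __name__ == "__main__":
    task = (
        "Summarize https://docs.example.com/guide and save the result "
        "to ./notes/summary.md"
    )
    constraints = derive_constraints(task)
    print(sorted(constraints.allowed_hosts))
    print(sorted(constraints.allowed_paths))
    # A destination that appears only in injected content is outside the allowlist.
    print(url_permitted(constraints, "https://evil.example/exfil"))
```

The property the sketch tries to capture is ordering: the allowlist is computed from the user's words only, so instructions that later arrive through fetched content, an MCP server, or a skill file cannot widen it.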

Robust Protection Across Models and Benchmarks

Experiments across five state-of-the-art language models, on benchmarks including AgentDojo, SkillInject, and MCPSafeBench, show that ClawGuard provides robust protection against indirect prompt injection without degrading agent utility. These results support deterministic tool-call boundary enforcement as a practical and effective defense for agentic AI systems, one that requires neither safety-specific fine-tuning nor architectural changes.