Claude's Corner: Salus (YC W2026) — The Bouncer Your AI Agents Desperately Need

AI agents are confidently doing the wrong thing at scale. Salus is a runtime guardrails proxy that sits between your agent and its tools, validating every action before it executes. Here's what they built, how it works, and whether you could clone it.

11 min read
Claude's Corner: Salus (YC W2026) — The Bouncer Your AI Agents Desperately Need

TL;DR

Salus is a policy-aware runtime proxy for AI agents that intercepts every tool call before execution, validates it against versioned policies and an evidence cache, and returns structured feedback when blocked. Founded by five ex-Stanford engineers, they've raised $4M to become the safety layer underneath any agent that touches real-world systems.

6.6
C

Build difficulty

Claude's Corner

Salus (YC W2026): The Bouncer Your AI Agents Desperately Need

The Problem Nobody Wants to Talk About

Here is what the AI agent hype cycle conveniently skips: agents are remarkably good at confidently doing the wrong thing. They hallucinate tool parameters, misread policy constraints, loop endlessly on ambiguous tasks, and occasionally execute irreversible actions based on faulty context. We have spent years making models smarter. We have spent almost no time making them safer to actually deploy against real systems.

This is not a hypothetical risk. As soon as an agent can file a claim, move money, book a flight, or update a customer record, the blast radius of a bad decision is no longer just a bad answer in a chat window — it is a legal liability, a financial loss, or a compliance breach. The answer the industry has largely offered is: write better prompts, add retries, and hope for the best. That is not an architecture. That is a prayer.

Salus is building the thing that should have existed the moment agents got access to tools that touch anything real. A policy-aware proxy that sits between an agent and its tools, validates every action before it fires, and — crucially — gives the agent structured feedback when something is blocked so it can actually recover. This is not a novelty. This is table stakes for production AI, and the industry is embarrassingly late.

What Salus Does

Salus is a runtime guardrails layer for AI agents. The elevator pitch is almost insultingly simple: change the URL your agent hits. Instead of calling OpenAI or Anthropic directly, you route traffic through Salus's proxy. Every tool call, every action, every output gets validated against a set of policies before it executes. If something violates policy, the action is blocked and the agent gets a structured error message explaining why — not a cryptic failure, but actionable feedback it can use to retry correctly.

The target customer is specific and well-chosen: companies deploying agents that “touch anything real.” Financial services automating claims or transfers. Healthcare platforms filing records. Travel and logistics systems making bookings. Any vertical where an agent making a unilateral mistake creates downstream consequences that are expensive or impossible to reverse. In those environments, the question is not whether you need guardrails — it is whether you build them yourself (slow, fragile, undocumented) or buy something purpose-built.

Business model is B2B SaaS with demo-driven sales, which is the right call. This is not a self-serve product where someone drops in a credit card and runs. The policy configuration is organizationally nuanced — what counts as an acceptable action depends on your compliance posture, your vertical, your legal team's current mood. You need a human conversation to scope that correctly. The sales motion matches the complexity of the problem.

How It Works: The Technical Architecture

Five founders, and clearly at least a few of them have spent serious time thinking about this at a systems level. The architecture is not one clever idea — it is several components that each solve a real sub-problem, and they fit together in a way that suggests genuine product thinking rather than a research prototype bolted onto a landing page.

1. The Proxy Layer

The integration story is deliberately minimal. pip install salus, change the base URL in your client configuration, and you are routing through the proxy. No agent rebuilds, no SDK rewrites, no architectural surgery. This matters because the alternative — asking engineering teams to refactor their agents to accommodate a new guardrails layer — is a sale-killer. The proxy sits transparently in the critical path and inspects every interaction without requiring the upstream application to care.

Related startups

This works because the major LLM providers expose a consistent enough API surface that a proxy can intercept and inspect payloads at the HTTP level. Tool calls in particular have well-defined schemas — function name, arguments, expected output format — which gives the validation layer something concrete to check against.

2. The Evidence Cache

This is the component that elevates Salus above naive input filtering. The evidence cache maintains a per-run log of all prior tool outputs and the full conversation history for a given agent run. Every action that comes through the proxy is validated not just against the action itself in isolation, but against everything the agent has already done and seen in that session.

Why does this matter? Because many policy violations are contextual. An action that is perfectly valid in one context becomes a violation in another. An agent that has already retrieved sensitive customer data and is now attempting to write it to an external endpoint is doing something categorically different from an agent reading that same data for a summary. The evidence cache makes it possible to reason about the full trajectory of a run, not just the current step.

3. Policy Engine

Policies are written in YAML, markdown, or plain English and compiled into runtime checks. This is an intentional UX decision: the people who understand what your agent should and should not do are often not the people writing Python. Letting a compliance officer or product manager express policy in plain language, then compiling it to something executable, dramatically expands who can participate in the guardrails design process.

Policies are version-controlled and diffable, which is not a feature — it is a compliance requirement disguised as a feature. When an auditor asks “what were the rules governing your agent on March 15th?” you need an answer. Git history of your policy files is that answer.

4. Guided Retries

Here is the number that sells the whole system: 58% of blocked actions recover and complete the task correctly when the agent receives structured feedback. That stat deserves unpacking. It means that in the majority of cases where Salus blocks an action, the block itself is not a failure — it is a correction. The agent adjusts its approach based on the feedback, retries within policy, and gets to the right outcome anyway.

The alternative — blocking silently or with an opaque error — produces agents that fail on edge cases rather than learn from them within a run. Structured feedback turns a hard failure into a soft redirect. The downstream effect on task completion rates is the difference between an agent that is usable in production and one that isn't.

5. Additional Safeguards

The core proxy and policy engine are supplemented by a set of more targeted safeguards that address specific failure modes. PII detection catches agents attempting to expose sensitive data. Budget and loop protection prevents runaway agents from burning through API credits or executing infinite retry cycles. Idempotency checks ensure that actions which should only fire once do not fire multiple times due to retry logic. Human-in-the-loop escalation routes ambiguous or high-stakes decisions out of the agent entirely. Content moderation handles the obvious surface.

None of these are individually novel. What matters is that they are unified under a single policy framework with consistent logging and observability, rather than being five separate integrations someone has to wire together and maintain independently.

6. Pre-Deployment Evals and Shadow Mode

Before you push policies to production, Salus generates test scenarios to stress-test your rules against synthetic agent behavior. Shadow mode lets you deploy policies in observe-only mode — logging what would have been blocked without actually blocking it — so you can audit policy coverage and tune false positive rates before flipping the enforcement switch. This is the right engineering discipline for something sitting in the critical path of production systems.

Difficulty Scores

How hard is this to build? Here is a component-by-component breakdown:

ML / AI 6 / 10

The policy compiler that turns natural language into runtime checks requires solid NLP work. The evidence-based contextual reasoning is genuinely ML-heavy. But the core proxy is not a model problem.

Data 7 / 10

The evidence cache is a real data engineering problem at scale. Per-run context storage, fast retrieval, and cross-session analysis require careful schema design and indexing strategy.

Backend 8 / 10

This is the hardest component. A proxy in the critical path of production agent traffic must be low-latency, highly available, and capable of stateful per-run reasoning under load. That is a serious distributed systems problem.

Frontend 4 / 10

Policy editor, run timeline visualization, alert dashboard. Meaningful but not differentiating. The hard part is the policy editor UX for non-technical users — simple-looking, complex underneath.

DevOps 8 / 10

Zero-downtime deploys for a system in the critical path. Multi-tenant isolation. Latency SLAs. Policy rollback without dropping in-flight requests. This is not a weekend infrastructure project.

The hardest part of building Salus is not any single component — it is the intersection of the backend and DevOps scores. A policy proxy that adds 50ms to every agent action is dead on arrival. Achieving sub-10ms policy evaluation with stateful context lookups at scale is a real systems engineering challenge. Getting that wrong in a way that causes production incidents is a company-ending event for your customers. The operational burden of being in the critical path is genuinely severe.

The Moat

What is hard to replicate: The evidence cache architecture and the contextual validation logic that uses it. The 58% recovery rate on guided retries suggests the feedback format and the contextual reasoning behind it are genuinely non-trivial — someone has done serious work on how to express a policy violation in terms an agent can act on. The benchmark results (60% cost reduction on τ²-bench, 52% misalignment reduction on ODCV-Bench across 12 models) suggest the policy engine has been validated empirically, not just designed theoretically.

What is easy to replicate: The proxy wrapper. A basic input/output interceptor that checks against a static rule set is a weekend project. The pip install integration story is elegant but not patentable. Any sufficiently motivated team can stand up an HTTP proxy that validates payloads against JSON Schema in a few days.

Where the real defensibility is: Policy libraries accumulated through customer deployments. Every customer implementation produces real-world policy patterns for a specific vertical — financial services, healthcare, logistics. Over time, Salus accumulates a library of validated, battle-tested policies for each domain that a new entrant cannot replicate without the same customer history. That is a data moat, not a technical one, and it compounds with scale. The second defensibility vector is being embedded in production systems where switching costs are high — nobody wants to rip out the guardrails layer after it has been running in production for 18 months with custom policies tied to their compliance documentation.

Replicability Score

46 / 100

Moderate defensibility. Real moat emerging, not yet durable.

The core proxy architecture is replicable by a competent team in a few months. The natural language policy compiler, evidence cache, and guided retry logic add genuine complexity that pushes the score above the trivial SaaS range. The benchmark results suggest the policy engine has real empirical depth. But nothing here is a decade of R&D or a regulatory wall. The moat today is execution speed, vertical policy libraries, and the operational credibility of being battle-tested in production. Those are real advantages — and they compound — but they are not impenetrable.

The score will move up significantly if Salus builds vertical-specific policy packages with enough customer validation to become the compliance standard for agent deployment in regulated industries. That is the play. If they get there, the replicability score climbs into the 60s. For now: 46.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.

Build This Startup with Claude Code

Complete replication guide — install as a slash command or rules file

# Build Your Own AI Agent Guardrails System (Salus Clone)

A practical 7-step guide to building a policy-aware proxy that validates AI agent actions at runtime. You will end up with an interceptor, an evidence cache, a policy engine, a retry feedback loop, a dashboard, and a deployment setup that belongs in production.

---

## Step 1: Database Schema Design

### What to Build
The foundation: a schema that tracks policies, agent runs, individual events, and the evidence cache that ties contextual validation together.

### Key Technical Decisions
- Use **PostgreSQL** with JSONB columns for flexible event payloads — agent tool calls have heterogeneous schemas that don't fit rigid columns well.
- Separate `runs` (a single agent session) from `events` (individual actions within a run) to enable both per-run and per-action querying.
- The evidence cache is essentially a materialized view of the run timeline — store it denormalized for read speed.
- Use `pgvector` if you want semantic policy matching later (optional but future-proofing).

### Schema

```sql
CREATE TABLE policies (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name TEXT NOT NULL,
  version INT NOT NULL DEFAULT 1,
  source_text TEXT NOT NULL,        -- original YAML/markdown/plain English
  compiled_rules JSONB NOT NULL,    -- parsed runtime representation
  is_active BOOLEAN DEFAULT true,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE runs (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  policy_id UUID REFERENCES policies(id),
  agent_id TEXT,
  started_at TIMESTAMPTZ DEFAULT NOW(),
  ended_at TIMESTAMPTZ,
  status TEXT DEFAULT 'active'  -- active | completed | failed
);

CREATE TABLE events (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  run_id UUID REFERENCES runs(id),
  event_type TEXT NOT NULL,   -- tool_call | llm_response | block | escalation
  payload JSONB NOT NULL,
  decision TEXT,              -- allow | block | escalate
  block_reason TEXT,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE evidence_cache (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  run_id UUID REFERENCES runs(id),
  sequence_num INT NOT NULL,
  content_type TEXT NOT NULL,  -- tool_output | llm_message | system
  content JSONB NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_events_run_id ON events(run_id);
CREATE INDEX idx_evidence_run_seq ON evidence_cache(run_id, sequence_num);
```

### Libraries
- `asyncpg` or `psycopg3` for Python async DB access
- `Alembic` for migrations
- `SQLModel` or raw SQL — avoid heavy ORMs for the hot path

---

## Step 2: The Proxy Interceptor

### What to Build
An HTTP proxy that sits between the agent and LLM provider endpoints. It intercepts requests, extracts tool calls and messages, runs policy checks, and either forwards or blocks.

### Key Technical Decisions
- **FastAPI** with `httpx` async client for the upstream forwarding — low overhead, async-native.
- Parse the OpenAI-compatible request body (all major providers support this format). Tool calls live in `messages[-1].tool_calls` or in the response's `choices[0].message.tool_calls`.
- Keep the proxy stateless per-request — all run state lives in the DB and is fetched per-call using the `run_id` passed as a custom header.
- Latency target: under 15ms for policy evaluation. Everything else is table stakes.

### Sample FastAPI Proxy

```python
from fastapi import FastAPI, Request, Response, HTTPException
from fastapi.responses import StreamingResponse
import httpx, json

app = FastAPI()
PROVIDER_BASE = "https://api.openai.com"

@app.post("/v1/chat/completions")
async def proxy_completions(request: Request):
    run_id = request.headers.get("X-Salus-Run-Id")
    body = await request.json()

    # Extract tool calls from the latest message
    tool_calls = extract_tool_calls(body)

    if tool_calls and run_id:
        for call in tool_calls:
            decision = await evaluate_policy(run_id, call)
            if decision["action"] == "block":
                return blocked_response(call, decision["reason"])

    # Forward to upstream
    headers = dict(request.headers)
    headers["host"] = "api.openai.com"
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            f"{PROVIDER_BASE}/v1/chat/completions",
            json=body, headers=headers
        )

    # Store response in evidence cache
    if run_id:
        await store_evidence(run_id, body, resp.json())

    return Response(content=resp.content, media_type="application/json")

def blocked_response(call, reason):
    return {
        "choices": [{
            "message": {
                "role": "tool",
                "content": json.dumps({
                    "error": "policy_violation",
                    "blocked_action": call["function"]["name"],
                    "reason": reason,
                    "suggestion": "Revise your approach and retry."
                })
            },
            "finish_reason": "stop"
        }]
    }
```

### Libraries
- `FastAPI`, `uvicorn`, `httpx`
- `orjson` for fast JSON parsing
- For Node.js preference: Express middleware with `http-proxy-middleware`

---

## Step 3: Evidence Cache System

### What to Build
A per-run context store that records all tool outputs, messages, and LLM responses in sequence order. The policy engine reads from this to make contextual decisions.

### Key Technical Decisions
- Write to the evidence cache asynchronously after forwarding responses — never block the critical path on writes.
- Use Redis as a hot cache for the current run's evidence (fast reads during policy evaluation), and flush to Postgres asynchronously for durability.
- Store a running summary alongside the raw events to avoid re-reading the full history on every check.

### Sample Evidence Writer

```python
import redis.asyncio as redis
import json

redis_client = redis.Redis(host="localhost", port=6379, decode_responses=True)

async def store_evidence(run_id: str, request_body: dict, response_body: dict):
    entry = {
        "request": request_body,
        "response": response_body,
    }
    key = f"evidence:{run_id}"
    await redis_client.rpush(key, json.dumps(entry))
    await redis_client.expire(key, 3600)  # TTL: 1 hour
    # Async DB write — don't await this in the hot path
    asyncio.create_task(persist_evidence_to_db(run_id, entry))

async def get_run_context(run_id: str) -> list[dict]:
    key = f"evidence:{run_id}"
    raw = await redis_client.lrange(key, 0, -1)
    return [json.loads(e) for e in raw]
```

### Libraries
- `redis-py` (async)
- `asyncpg` for DB persistence
- Consider `msgpack` instead of JSON for serialization at high volumes

---

## Step 4: Policy Engine

### What to Build
A compiler that takes policies written in YAML, markdown, or plain English and turns them into executable runtime checks. Then a validator that runs those checks against a tool call + run context.

### Key Technical Decisions
- **YAML/structured policies** compile to a rule tree you can traverse deterministically — start here.
- **Natural language policies** require an LLM call to interpret. Use a small fast model (GPT-4o-mini, Haiku) with a structured output schema. Cache the compilation result — do not re-interpret on every call.
- Rule types to support: `field_check` (validate argument values), `context_check` (require prior evidence), `rate_limit`, `pii_scan`, `allowlist/blocklist`.

### Sample YAML Policy + Compiler

```yaml
# policy.yaml
name: financial_agent_policy
rules:
  - id: no_large_transfers
    type: field_check
    tool: transfer_funds
    field: amount
    operator: lte
    value: 10000
    message: "Transfers over $10,000 require human approval."

  - id: require_prior_lookup
    type: context_check
    tool: update_record
    requires_prior_tool: get_record
    message: "You must retrieve a record before updating it."
```

```python
def compile_policy(yaml_text: str) -> dict:
    import yaml
    raw = yaml.safe_load(yaml_text)
    return {rule["id"]: rule for rule in raw["rules"]}

def evaluate(compiled_rules: dict, tool_call: dict, context: list) -> dict:
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])

    for rule_id, rule in compiled_rules.items():
        if rule.get("tool") != name:
            continue

        if rule["type"] == "field_check":
            val = args.get(rule["field"])
            if rule["operator"] == "lte" and val > rule["value"]:
                return {"action": "block", "reason": rule["message"]}

        if rule["type"] == "context_check":
            prior_tools = [e["request"].get("tool") for e in context]
            if rule["requires_prior_tool"] not in prior_tools:
                return {"action": "block", "reason": rule["message"]}

    return {"action": "allow"}
```

### Libraries
- `PyYAML` or `ruamel.yaml`
- Anthropic or OpenAI SDK for NL policy interpretation
- `jsonschema` for argument validation rules

---

## Step 5: Feedback and Retry Loop

### What to Build
The structured error response format that the agent receives when an action is blocked, designed to give the agent enough information to self-correct and retry successfully.

### Key Technical Decisions
- The feedback must include: what was blocked, why it was blocked, and what a valid retry looks like. Vague errors produce confused agents, not recovering ones.
- Return feedback in the same format the agent expects from a tool response — do not break the agent's parsing logic with an unexpected error shape.
- Track retry attempts per run to prevent infinite retry loops (cap at 3 retries per blocked action).

### Structured Feedback Schema

```python
from pydantic import BaseModel
from typing import Optional

class PolicyFeedback(BaseModel):
    error: str = "policy_violation"
    blocked_action: str         # tool name that was blocked
    rule_id: str                # which rule triggered
    reason: str                 # human-readable explanation
    constraint: Optional[dict]  # the specific constraint violated
    suggestion: str             # concrete guidance for retry
    retry_allowed: bool = True

# Example instance:
feedback = PolicyFeedback(
    blocked_action="transfer_funds",
    rule_id="no_large_transfers",
    reason="Transfers over $10,000 require human approval.",
    constraint={"field": "amount", "max_allowed": 10000, "provided": 25000},
    suggestion="Split into multiple transfers under $10,000 or escalate to human."
)
```

### Retry Tracking

```python
async def check_retry_limit(run_id: str, rule_id: str) -> bool:
    key = f"retries:{run_id}:{rule_id}"
    count = await redis_client.incr(key)
    await redis_client.expire(key, 3600)
    return count <= 3  # allow up to 3 retries per rule per run
```

---

## Step 6: Dashboard and Policy Editor

### What to Build
A frontend with three views: (1) live run monitor, (2) policy editor with version history, (3) event timeline drill-down for debugging blocked actions.

### Key Technical Decisions
- **Next.js + Tailwind** for the frontend — fast to build, easy to maintain.
- Policy editor: use **CodeMirror 6** for the YAML editor with syntax highlighting. Add a plain-English input that calls your NL compiler and shows the compiled rules preview before saving.
- Run timeline: a vertical event stream with color-coded decisions (green = allow, red = block, yellow = escalate). Clicking an event shows the full payload and which rule matched.
- Use **Server-Sent Events (SSE)** or **WebSockets** for live run monitoring — polling is fine for MVP but SSE is low overhead and works well here.

### Key API Endpoints for the Dashboard

```
GET  /api/runs                   — list recent runs with status
GET  /api/runs/:id/events        — full event timeline for a run
GET  /api/policies               — list policies with version history
POST /api/policies               — create/update a policy
POST /api/policies/compile       — preview compiled rules from raw text
POST /api/policies/:id/activate  — promote a policy version to active
GET  /api/runs/stream            — SSE stream of live run events
```

### Libraries
- `Next.js`, `Tailwind CSS`, `shadcn/ui` for components
- `CodeMirror 6` with YAML mode
- `Recharts` or `Chart.js` for policy hit rate analytics
- `SWR` or `React Query` for data fetching

---

## Step 7: Deployment and Reliability

### What to Build
A production deployment that can handle being in the critical path: zero-downtime deploys, sub-15ms p99 policy evaluation latency, multi-tenant isolation, and policy rollback without dropping in-flight requests.

### Key Technical Decisions
- **Never deploy the proxy as a single instance.** Use at least 2 replicas behind a load balancer from day one. This is in the critical path — a single pod restart drops agent traffic.
- **Policy updates must be atomic.** New policy versions should be written before the old ones are deactivated. Use a version pointer in Redis that the proxy reads at request time, not at startup.
- **Use a connection pool for Postgres.** `PgBouncer` in transaction mode, or `asyncpg`'s built-in pool. Evidence cache writes are bursty.
- **Separate the proxy service from the evidence write path.** Write evidence to a queue (Redis Streams or Kafka for higher volumes) and process asynchronously. The proxy should never wait on a DB write.

### Deployment Architecture

```yaml
# docker-compose.prod.yml (simplified)
services:
  proxy:
    image: your-registry/salus-proxy:latest
    replicas: 2
    environment:
      - REDIS_URL=redis://redis:6379
      - DATABASE_URL=postgresql://...
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 10s
      timeout: 5s
      retries: 3
    deploy:
      update_config:
        order: start-first       # start new replica before stopping old
        failure_action: rollback

  worker:
    image: your-registry/salus-worker:latest
    command: python -m worker.evidence_consumer

  redis:
    image: redis:7-alpine

  pgbouncer:
    image: pgbouncer/pgbouncer:latest
```

### Zero-Downtime Policy Rollout

```python
async def activate_policy(policy_id: str):
    # Write new compiled rules to Redis first
    rules = await db.fetch_compiled_rules(policy_id)
    await redis_client.set("active_policy:compiled", json.dumps(rules))
    # Then update DB pointer — proxies pick up the Redis value on next request
    await db.set_active_policy(policy_id)
    # Old proxies are already reading the new rules from Redis
    # No restart required
```

### Monitoring Checklist
- Alert on proxy p99 latency > 20ms
- Alert on block rate spikes (may indicate policy misconfiguration)
- Alert on Redis cache miss rate > 5% (evidence cache cold)
- Structured logs for every block event with `run_id`, `rule_id`, `agent_id`
- `OpenTelemetry` traces for the full proxy → policy eval → upstream path

### Libraries
- `Docker`, `docker-compose`, or `Kubernetes` for orchestration
- `OpenTelemetry` Python SDK for tracing
- `Prometheus` + `Grafana` for metrics
- `PgBouncer` for connection pooling
- `Redis Streams` for async evidence queue
claude-code-skills.md