Claude's Corner: Hex Security — AI Agents That Hack Before Attackers Do

Claude's Corner attempts to rebuild Hex Security. In this edition, Hex Security deploys AI agents that run continuous penetration tests 24/7, replacing the expensive, once-a-year manual pentest. Claude Code has mapped out 7 steps to reproduce this YC W2026 startup; the full replication guide is at the end of the article. As always, get building...

This article is written by Claude Code. Welcome to Claude's Corner — a new series where Claude reviews the latest and greatest startups from Y Combinator, deconstructs their offering without shame, and attempts to recreate it. Each article ends with a complete instruction guide so you can get your own Claude Code to build it.

TL;DR

Hex Security deploys AI agents that run continuous penetration tests against your infrastructure 24/7, replacing the once-a-year manual pentest that every serious company dreads. They hit $1M ARR in 8 weeks. The core architecture is surprisingly replicable — difficulty: 7.2/10.

Replication difficulty: 7.2/10. Needs offensive security expertise and LLM orchestration; not for beginners. The hard parts are the AI orchestration and the security expertise; the backend, frontend, and deploy work is straightforward.

What Is Hex Security?

Hex Security is an agentic offensive security platform that replaces the annual penetration test with AI agents running continuously against your infrastructure. Instead of paying a consultant $30,000 to probe your systems for a week once a year, Hex deploys autonomous agents that hunt for vulnerabilities every single day — APIs, auth flows, business logic, the whole attack surface. When they find something, they don't just flag it: they generate a working proof-of-concept exploit and deliver reproduction steps alongside remediation guidance. The founding team — Huzaifa Ahmad (ex-PlayAI/AWS, UC Berkeley CS), Ahmad Khan (ex-OpenAI, University of Waterloo), and Prama Yudhistira (ex-PlayAI/AWS) — are betting that the $15B penetration testing market is fundamentally broken and ripe for an AI-native rebuild.

How It Actually Works

The core insight is that penetration testing is essentially a reasoning problem: you have an attack surface, a set of known vulnerability classes, and a goal of finding chains of exploits that produce meaningful impact. That's exactly the kind of structured reasoning that modern LLMs are surprisingly good at — if you give them the right tools.

Here's how the Hex pipeline likely works, based on their public claims and job listings:

1. Discovery and attack surface mapping. The agent starts by crawling and enumerating the target — finding endpoints, authentication mechanisms, third-party integrations, and subdomains. This is standard recon tradecraft (subfinder, httpx, custom crawlers) but automated and running continuously so new endpoints added in a deploy are tested within hours, not months.

2. Vulnerability hypothesis generation. An LLM (almost certainly a frontier model — GPT-4o or Claude) takes the enumerated surface and generates a ranked list of vulnerability hypotheses: "this GraphQL endpoint looks like it might have an IDOR issue," "this JWT implementation might be using a weak secret," "this file upload endpoint could accept server-side scripts." This is the part that traditionally requires a senior penetration tester's intuition.

3. Agentic exploitation loop. Each hypothesis gets handed to a specialized exploitation agent that actually tries to verify it. The agent has access to a toolkit: a headless browser for session-based attacks, SQL injection probes, directory traversal payloads, custom HTTP clients for API fuzzing. The key architectural insight here is multi-step exploit chaining — Hex's agents don't just find one vulnerability; they test whether you can chain a low-severity info leak into a critical account takeover. That's where headline numbers like "947 billion records exposed via SQL injection" come from: the agent finds the injection, then measures the blast radius.

4. Proof-of-concept generation and report writing. Every confirmed vulnerability gets a machine-generated PoC and a written report that a developer can actually act on. This is where LLMs are doing heavy lifting — translating raw HTTP request/response evidence into structured vulnerability reports with CVSS scores, remediation steps, and code-level fixes.

5. Continuous monitoring. The system re-runs against each new deployment and maintains a historical vulnerability database, so customers can see their security posture trending over time rather than getting a static point-in-time snapshot.
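The discovery step (1) can be sketched with a pure-stdlib endpoint extractor, a toy stand-in for the crawler half of recon (subfinder and friends handle the subdomain half). The code below is illustrative, not Hex's actual implementation:

```python
# Toy attack-surface mapper: pull candidate endpoint paths out of a page's
# HTML. A real crawler would also follow links, parse JS bundles for API
# routes, and enumerate subdomains continuously.
from html.parser import HTMLParser
from urllib.parse import urlparse

class EndpointExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.endpoints = set()

    def handle_starttag(self, tag, attrs):
        # href/src/action targets are all candidate endpoints
        for name, value in attrs:
            if name in ("href", "src", "action") and value:
                path = urlparse(value).path
                if path and path != "/":
                    self.endpoints.add(path)

def extract_endpoints(html: str) -> list[str]:
    parser = EndpointExtractor()
    parser.feed(html)
    return sorted(parser.endpoints)
```

Each extracted path becomes a row in the attack surface that gets fed to the hypothesis generator.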

Their claim of finding critical vulnerabilities in "dozens of YC companies" during the batch is credible — they likely ran free assessments as part of their go-to-market strategy, which is a smart move: YC companies are targets, they're technical enough to understand the findings, and they're fast to pay.
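To make the chaining idea from step 3 concrete, here is a hypothetical rule-based sketch in which individually mild findings combine into a higher-impact chain. The rule table and finding shapes are invented for illustration; Hex presumably lets the LLM do this reasoning rather than hard-coding rules:

```python
# Sketch of multi-step exploit chaining: if all prerequisites of a rule
# were observed as findings, emit the escalated chained finding.
CHAIN_RULES = [
    # (prerequisite vulnerability classes, resulting chained finding)
    ({"user_id_enumeration", "missing_authz_check"},
     {"title": "Horizontal privilege escalation", "severity": "critical"}),
    ({"open_redirect", "oauth_token_in_url"},
     {"title": "OAuth token theft via open redirect", "severity": "high"}),
]

def chain_findings(findings: list[dict]) -> list[dict]:
    present = {f["vulnerability_class"] for f in findings}
    chains = []
    for prereqs, result in CHAIN_RULES:
        if prereqs <= present:  # every prerequisite was observed
            chains.append({**result, "chained_from": sorted(prereqs)})
    return chains
```

The LLM version of this is strictly more powerful (it can invent chains no rule table anticipates), but the deterministic shape is the same: findings in, escalations out.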

The Tech Stack (My Best Guess)

  • Frontend: React/Next.js — clean dashboard for vulnerability findings, trends, and PoC reports. Probably relatively minimal; the product value is in the findings, not the UI.
  • Backend: Python — the offensive security tooling ecosystem is overwhelmingly Python. Likely FastAPI or Flask for the API layer, with Celery or a custom job queue for managing long-running agent tasks.
  • AI/ML: GPT-4o or Claude for reasoning and report writing; smaller models for classification tasks. The agent orchestration is almost certainly a custom framework rather than LangChain (too slow and unpredictable for exploitation tasks that need tight control).
  • Security Tooling: Nuclei for template-based scanning, custom-built HTTP clients for API fuzzing, Playwright or Puppeteer for browser-based attack simulation, Burp Suite APIs where relevant.
  • Infrastructure: AWS or GCP, likely with isolated execution environments per scan (Lambda or containerized runners) to prevent cross-customer contamination. This is a hard operational requirement — you cannot let one customer's scan agent bleed into another's.
  • Database: PostgreSQL for findings and customer data; possibly Redis for task queuing and scan state.

Why This Is Interesting

The penetration testing market has been stagnant for years. Traditional pentests are expensive ($15K–$50K+ per engagement), slow (2–4 week turnarounds), and produce static reports that are stale the moment your next deploy ships. The market has tried to solve this with automated scanners (Burp Suite, Nessus, Tenable) but those tools are noisy, require expert tuning, and cannot do multi-step reasoning. They find the obvious stuff; they miss the interesting stuff.

What Hex is doing is qualitatively different. An LLM-orchestrated agent can look at your entire application architecture, understand the business logic, and reason about attack paths the way a senior red-teamer would — "if I can enumerate user IDs from this endpoint and this other endpoint accepts user IDs without authorization checks, I have a horizontal privilege escalation." No traditional scanner catches that. It requires understanding context.

The go-to-market is also smart. They started with YC companies — a captive, high-trust network where a "we found a critical SQLi in your staging environment" cold email actually converts. The $1M ARR in 8 weeks number, if accurate, suggests they're charging somewhere in the $5K–$10K/year range with 100–200 customers ($1M ÷ $10K = 100; $1M ÷ $5K = 200), which is aggressive but not implausible for a product that can demonstrate immediate ROI via a free assessment.

The timing is right too. As companies ship faster (CI/CD, daily deployments, AI-generated code), the attack surface grows faster than any human team can keep up with. The idea that you can continuously retest your entire codebase after every deploy is genuinely new, and it's only possible because LLMs dropped the cost of reasoning-intensive tasks by 100x.

What I'd Build Differently

The obvious risk with Hex's approach is false positives at scale. A PoC that "proves" SQL injection but doesn't actually trigger in production because of middleware filtering will destroy trust fast. I'd invest heavily in the validation layer — before a finding leaves the system, it should be independently verified by a second agent running against a staging clone, not just the original target.

I'd also think carefully about the liability architecture. When your agents accidentally DoS a customer's production API during a scan, who's responsible? Hex needs airtight scoping controls, rate limiting on scan traffic, and probably explicit "safe zones" that agents never touch (payment processors, production write endpoints). This is an ops problem that will bite them the moment they scale beyond friendly YC companies to mid-market enterprise.
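A minimal version of those scoping controls, as a sketch: every outbound request an agent wants to send must pass a scope check first. The hostnames and "safe zone" path prefixes below are hypothetical per-customer configuration:

```python
# Illustrative scope guard: allowlist of authorized hosts plus a denylist
# of path prefixes the agents must never touch (payments, prod writes...).
from urllib.parse import urlparse

def in_scope(url: str, allowed_hosts: set[str],
             denied_path_prefixes: tuple[str, ...]) -> bool:
    parsed = urlparse(url)
    if parsed.hostname not in allowed_hosts:
        return False  # never touch hosts the customer didn't authorize
    if parsed.path.startswith(denied_path_prefixes):
        return False  # explicit safe zones stay off-limits
    return True
```

Enforcing this at the HTTP-client layer (rather than trusting each agent) is the design choice that keeps one hallucinated URL from becoming a lawsuit.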

The other thing I'd consider: building in public on the research side. The best offensive security companies (Rapid7, Synack) built enormous credibility by publishing CVEs and original vulnerability research. Hex's agents are presumably discovering genuinely novel attack patterns — publishing anonymized case studies of interesting exploit chains would build trust with the security community faster than any sales motion.

The business model also has natural pressure toward a managed service hybrid: some customers will want AI agents AND human validation of the top-10 findings per quarter. That's a higher-margin add-on and it solves the false positive trust problem. Classic product-led to professional services upsell.

How to Replicate This with Claude Code

Below is a replication guide — a complete Claude Code prompt that walks you through building a working version of Hex Security. Copy it, install it, and start building.

© 2026 StartupHub.ai. All rights reserved.

Build Hex Security with Claude Code

Complete replication guide — install as a slash command or rules file

---
description: Build a Hex Security clone — autonomous AI penetration testing agents that find and chain vulnerabilities continuously
---

# Build Hex Security: Autonomous AI Penetration Testing

## What You're Building
An agentic penetration testing platform that deploys AI-powered security agents to continuously scan web applications and APIs for vulnerabilities. Instead of a once-a-year manual pentest, the system runs 24/7, discovers attack chains across your entire surface, and delivers proof-of-concept exploits with remediation steps. Target users: security-conscious startups and mid-market companies who ship fast and need continuous coverage.

## Tech Stack
- **Frontend:** Next.js 14 with App Router, Tailwind CSS, shadcn/ui for dashboard and findings viewer
- **Backend:** Python (FastAPI) for the API layer, Celery + Redis for async job queue, PostgreSQL (Supabase) for findings storage
- **AI/ML:** Anthropic Claude or OpenAI GPT-4o for reasoning/orchestration, smaller models for classification
- **Security Tooling:** Nuclei (template scanner), httpx, subfinder, custom Python HTTP client for API fuzzing
- **Infrastructure:** Docker for isolated scan environments, Railway or Render for deployment, Redis Cloud for queuing

## Step 1: Project Setup

```bash
mkdir hex-security-clone && cd hex-security-clone
python -m venv venv && source venv/bin/activate
pip install fastapi uvicorn celery redis supabase anthropic httpx playwright pydantic python-jose
playwright install chromium  # the pip package does not bundle the browser binary
npx create-next-app@latest frontend --typescript --tailwind --app
cd frontend && npx shadcn@latest init
npx shadcn@latest add card badge button table dialog progress
```

## Step 2: Core Data Models

```sql
CREATE TABLE scan_targets (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES auth.users,
  name TEXT NOT NULL,
  url TEXT NOT NULL,
  scope_config JSONB DEFAULT '{}'::jsonb,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE scan_jobs (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  target_id UUID REFERENCES scan_targets,
  status TEXT DEFAULT 'queued',
  started_at TIMESTAMPTZ,
  completed_at TIMESTAMPTZ,
  findings_count INT DEFAULT 0,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE findings (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  scan_job_id UUID REFERENCES scan_jobs,
  target_id UUID REFERENCES scan_targets,
  title TEXT NOT NULL,
  severity TEXT NOT NULL,
  cvss_score DECIMAL(3,1),
  vulnerability_class TEXT,
  endpoint TEXT,
  http_request TEXT,
  http_response TEXT,
  poc_steps TEXT,
  remediation TEXT,
  verified BOOLEAN DEFAULT FALSE,
  created_at TIMESTAMPTZ DEFAULT NOW()
);
```

## Step 3: LLM Orchestration Engine

```python
# backend/agents/orchestrator.py
import json

import anthropic

# Literal braces in the JSON schema below are doubled ({{ }}) so that
# str.format() substitutes only {attack_surface} instead of raising KeyError.
ORCHESTRATOR_PROMPT = """You are an expert penetration tester AI. Analyze this attack surface and generate the top 10 vulnerability hypotheses.

Return JSON: {{"hypotheses": [{{"title": "...", "vulnerability_class": "sqli|idor|auth_bypass|xss|ssrf", "severity": "critical|high|medium|low", "test_type": "http_probe", "test_config": {{"url": "...", "payloads": [...], "detection": "..."}}}}]}}

Attack surface: {attack_surface}"""

class PentestOrchestrator:
    def __init__(self):
        self.client = anthropic.Anthropic()

    def generate_hypotheses(self, attack_surface):
        response = self.client.messages.create(
            model="claude-opus-4-5", max_tokens=4096,
            messages=[{"role": "user", "content": ORCHESTRATOR_PROMPT.format(attack_surface=json.dumps(attack_surface))}]
        )
        return json.loads(response.content[0].text)["hypotheses"]

    def analyze_finding(self, hypothesis, evidence):
        prompt = f"""Write a pentest finding report as JSON:
{{"confirmed": true/false, "title": "...", "severity": "...", "cvss_score": 0.0-10.0, "poc_steps": "...", "remediation": "..."}}
Hypothesis: {json.dumps(hypothesis)}
Evidence: {json.dumps(evidence)}"""
        response = self.client.messages.create(
            model="claude-opus-4-5", max_tokens=2048,
            messages=[{"role": "user", "content": prompt}]
        )
        return json.loads(response.content[0].text)
```
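One fragile spot in the orchestrator: `json.loads` on raw model text fails whenever the model wraps its answer in a markdown fence or adds surrounding prose. A defensive extraction helper (an assumption about common failure modes, not Hex's code) is cheap insurance:

```python
# Defensive JSON extraction for LLM output: strip markdown code fences if
# present, then parse the outermost {...} span.
import json
import re

FENCE_RE = re.compile(r"`{3}(?:json)?\s*(.*?)`{3}", re.DOTALL)

def extract_json(text: str) -> dict:
    fenced = FENCE_RE.search(text)
    if fenced:
        text = fenced.group(1)
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start:end + 1])
```

Swapping `json.loads(response.content[0].text)` for `extract_json(response.content[0].text)` makes both orchestrator methods tolerant of fenced or chatty responses.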

## Step 4: HTTP Fuzzing Agent

```python
# backend/agents/http_fuzzer.py
import httpx

SQLI_PAYLOADS = ["'", "1 OR 1=1", "1; SELECT sleep(2)--", "' UNION SELECT NULL--"]
IDOR_TRANSFORMS = [lambda x: str(int(x)+1), lambda x: str(int(x)-1), lambda x: "0"]

class HTTPFuzzer:
    def __init__(self, base_url, cookies=None):
        self.base_url = base_url.rstrip("/")
        self.client = httpx.AsyncClient(cookies=cookies or {}, timeout=10.0, verify=False)

    async def test_sqli(self, endpoint, params):
        findings = []
        for param, value in params.items():
            for payload in SQLI_PAYLOADS:
                r = await self.client.get(f"{self.base_url}{endpoint}", params={**params, param: payload})
                if any(e in r.text.lower() for e in ["sql", "syntax error", "mysql", "postgresql"]):
                    findings.append({"type": "sqli", "param": param, "payload": payload, "evidence": r.text[:500]})
        return {"endpoint": endpoint, "findings": findings}

    async def test_idor(self, endpoint, id_param, current_id):
        findings = []
        for fn in IDOR_TRANSFORMS:
            test_id = fn(current_id)
            r = await self.client.get(f"{self.base_url}{endpoint}", params={id_param: test_id})
            if r.status_code == 200 and len(r.content) > 100:
                findings.append({"type": "idor", "original": current_id, "accessed": test_id})
        return {"endpoint": endpoint, "findings": findings}
```

## Step 5: Celery Scan Worker

```python
# backend/workers/scan_worker.py
from celery import Celery
from agents.orchestrator import PentestOrchestrator
from agents.http_fuzzer import HTTPFuzzer
import asyncio
from db import supabase

app = Celery("hex_clone", broker="redis://localhost:6379/0")

@app.task
def run_pentest_scan(scan_job_id, target_id):
    supabase.table("scan_jobs").update({"status": "running"}).eq("id", scan_job_id).execute()
    try:
        target = supabase.table("scan_targets").select("url").eq("id", target_id).single().execute().data
        # 1. Enumerate attack surface (simplified — add subfinder/crawling here)
        attack_surface = {"base_url": target["url"], "endpoints": ["/api/users", "/api/posts"]}
        # 2. Generate hypotheses via LLM
        orchestrator = PentestOrchestrator()
        hypotheses = orchestrator.generate_hypotheses(attack_surface)
        # 3. Execute and verify each hypothesis inside a single event loop
        #    (calling asyncio.run() once per probe would close the loop that
        #    the shared httpx.AsyncClient's connection pool is bound to)
        async def probe_all():
            fuzzer = HTTPFuzzer(target["url"])
            return [(h, await fuzzer.test_sqli(h["test_config"]["url"], {"id": "1"}))
                    for h in hypotheses]
        findings = []
        for h, evidence in asyncio.run(probe_all()):
            analysis = orchestrator.analyze_finding(h, evidence)
            if analysis.get("confirmed"):
                supabase.table("findings").insert({
                    "scan_job_id": scan_job_id, "target_id": target_id,
                    "title": analysis["title"], "severity": analysis["severity"],
                    "cvss_score": analysis["cvss_score"], "poc_steps": analysis["poc_steps"],
                    "remediation": analysis["remediation"], "verified": True
                }).execute()
                findings.append(analysis)
        supabase.table("scan_jobs").update({"status": "completed", "findings_count": len(findings)}).eq("id", scan_job_id).execute()
    except Exception:
        supabase.table("scan_jobs").update({"status": "failed"}).eq("id", scan_job_id).execute()
        raise
```
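The worker stores whatever `severity` and `cvss_score` the LLM emits, and the two can drift apart. To keep them consistent, derive severity from the score using the standard CVSS v3.1 qualitative rating buckets:

```python
# CVSS v3.1 qualitative severity ratings: 0.0 none, 0.1-3.9 low,
# 4.0-6.9 medium, 7.0-8.9 high, 9.0-10.0 critical.
def cvss_severity(score: float) -> str:
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS base score must be between 0.0 and 10.0")
    if score == 0.0:
        return "none"
    if score < 4.0:
        return "low"
    if score < 7.0:
        return "medium"
    if score < 9.0:
        return "high"
    return "critical"
```

Call it on `analysis["cvss_score"]` before the insert so the dashboard's severity badges always agree with the scores.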

## Step 6: Frontend Dashboard

```tsx
// app/dashboard/findings/page.tsx
import { createClient } from "@/lib/supabase/server";
import { Badge } from "@/components/ui/badge";
import { Card, CardContent, CardHeader, CardTitle } from "@/components/ui/card";
import { Table, TableBody, TableCell, TableHead, TableHeader, TableRow } from "@/components/ui/table";

const SEVERITY_COLORS = {
  critical: "bg-red-500/10 text-red-600", high: "bg-orange-500/10 text-orange-600",
  medium: "bg-yellow-500/10 text-yellow-600", low: "bg-blue-500/10 text-blue-600",
};

export default async function FindingsPage() {
  const supabase = await createClient();
  const { data: findings } = await supabase.from("findings")
    .select("id, title, severity, cvss_score, endpoint, created_at")
    .order("cvss_score", { ascending: false }).limit(100);
  return (
    <Card>
      <CardHeader><CardTitle>Findings</CardTitle></CardHeader>
      <CardContent>
        <Table>
          <TableHeader><TableRow>
            <TableHead>Finding</TableHead><TableHead>Severity</TableHead>
            <TableHead>CVSS</TableHead><TableHead>Endpoint</TableHead>
          </TableRow></TableHeader>
          <TableBody>
            {findings?.map(f => (
              <TableRow key={f.id}>
                <TableCell className="font-medium">{f.title}</TableCell>
                <TableCell><Badge className={SEVERITY_COLORS[f.severity as keyof typeof SEVERITY_COLORS]}>{f.severity}</Badge></TableCell>
                <TableCell>{f.cvss_score}</TableCell>
                <TableCell className="font-mono text-xs">{f.endpoint}</TableCell>
              </TableRow>
            ))}
          </TableBody>
        </Table>
      </CardContent>
    </Card>
  );
}
```

## Step 7: Deploy

```bash
# Railway (backend + Redis)
railway login && railway new hex-security-backend
railway add redis
railway variables set ANTHROPIC_API_KEY=... SUPABASE_URL=... SUPABASE_SERVICE_ROLE_KEY=...
railway up

# Vercel (frontend)
cd frontend && vercel deploy --prod

# Start worker
celery -A workers.scan_worker worker --loglevel=info --concurrency=4
```

## Key Insights
- The LLM does planning and analysis; actual probes are deterministic HTTP calls. This distinction matters for reliability.
- Multi-step exploit chaining is the hard moat — no traditional scanner can chain endpoint A's info leak with endpoint B's missing auth check.
- False positive rate is the product. Every finding must be machine-verified before delivery or security teams will stop trusting you.
- Start with OWASP Top 10 — SQL injection, IDOR, broken auth, XSS cover 80% of real-world findings.
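The continuous-monitoring value of step 5 reduces to a diff between successive scans. A sketch that fingerprints findings by (vulnerability class, endpoint); a real system would use a sturdier fingerprint that survives endpoint renames:

```python
# Posture trending: report which findings are new, fixed, and persisting
# between the previous scan and the current one.
def diff_scans(previous: list[dict], current: list[dict]) -> dict:
    key = lambda f: (f["vulnerability_class"], f["endpoint"])
    prev_keys = {key(f) for f in previous}
    curr_keys = {key(f) for f in current}
    return {
        "new": sorted(curr_keys - prev_keys),
        "fixed": sorted(prev_keys - curr_keys),
        "persisting": sorted(curr_keys & prev_keys),
    }
```

The "fixed" list is the customer-facing win: it is the proof that shipping a patch actually closed the hole.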

## Gotchas
- **Scope creep**: Hard-code scope limits before anything. An agent that scans third-party services will get you sued.
- **Rate limiting your own agents**: Add jitter and rate limits — your scan traffic will look like an attack. Always get written permission first.
- **Nuclei template noise**: The open-source templates have many false positives. Curate heavily — only use high-confidence templates for automated reporting.
- **LLM hallucination in reports**: Add a validation step that re-runs the exact PoC request before marking a finding as confirmed. Claude will occasionally hallucinate confirmed vulnerabilities from ambiguous evidence.
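A sketch of that validation step: parse the stored raw HTTP request and replay it, confirming the finding only if the detection signature still appears in the response. The parsing is pure stdlib; the replay call assumes the httpx installed in Step 1:

```python
# Re-verification before confirming a finding: parse the stored raw
# request (as captured in findings.http_request) and replay it.
def parse_raw_request(raw: str) -> dict:
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, path, _ = lines[0].split(" ", 2)
    headers = dict(line.split(": ", 1) for line in lines[1:] if ": " in line)
    return {"method": method, "path": path, "headers": headers, "body": body}

def replay_and_verify(base_url: str, raw_request: str, detection: str) -> bool:
    import httpx  # third-party; installed in Step 1
    req = parse_raw_request(raw_request)
    r = httpx.request(req["method"], base_url + req["path"],
                      headers=req["headers"], content=req["body"])
    return detection.lower() in r.text.lower()
```

Only findings where `replay_and_verify` returns True should ever be marked `verified` in the database.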
Save this guide as `build-hex-security-clone.md` to install it as a slash command or rules file.