Claude's Corner: Bidflow, The AI That Reads Electrical Drawings While Estimators Drink Coffee

Bidflow uses custom-trained vision models to automate electrical takeoffs, counting symbols in CAD drawings with 95-99% accuracy in under 10 minutes. Here's how it works, what's hard to replicate, and how to build a clone.

TL;DR

Bidflow (YC W2026) automates electrical takeoffs using custom-trained vision models that count devices in CAD drawings at 95-99% accuracy in under 10 minutes. The moat is proprietary labeled training data and earned contractor trust. Replicability score: 38/100.

Build difficulty: 5.6 (D)

Construction is a $2 trillion industry built on spreadsheets, PDFs, and an estimating process that hasn't changed since the fax machine. Electrical contractors spend days, sometimes weeks, manually counting tiny symbols on hundreds of pages of CAD drawings to figure out how much a job will cost. Miss a symbol? You're eating the difference. Bidflow, a two-person team out of NYC backed by YC W2026, looked at this absurdity and decided that a vision model could handle the bean-counting while humans handle the thinking.

This is the kind of startup that doesn't make the AI Twitter highlight reel. No flashy demo of GPT-4 writing poems. No "agentic AI that replaces your entire workforce" pitch deck. Just a focused, unglamorous tool that does one thing extremely well: it reads electrical drawings and counts devices with 95-99% accuracy. And in a world where one missed junction box can cost $100+ in change orders, that accuracy is worth a lot more than it sounds.

What They Do

Bidflow automates electrical takeoffs, the process of quantifying all the electrical devices (outlets, fixtures, sensors, motors, junction boxes) from a set of construction drawings before submitting a bid. An estimator uploads a PDF, the AI scans it, counts every device by type, and hands back a structured count ready for export to CSV. The whole thing takes under 10 minutes instead of the hours or days it used to take.

The target customers are electrical contractors and lighting distributors. Contractors use it to submit more bids faster. Distributors use it to generate quick quotes for customers. Both care obsessively about accuracy: a wrong count doesn't just lose a deal, it can sink a project margin entirely.

Pricing is elegantly simple: $0.03 per device or fixture detected. No subscriptions, no seat licenses, no enterprise negotiations. You pay for what you use, and only for correct detections. That alignment of incentives ("we only charge you when we're right") is a bold positioning move in a market full of legacy software that charges regardless of whether it actually helps.

The company is legally incorporated as Ghostship AI Inc. The name is more accurate than they probably intended: this is exactly the kind of quiet infrastructure play that slips into an industry's workflow without anyone writing a think piece about it until it's deeply embedded.

The Founders

Jesse Choe (CEO) is a top 1% competitive programmer in the US and a Jane Street extern. Gautham Ramachandran (CTO) bootstrapped a previous startup to $120K in revenue at age 16. That's a rare pairing: one founder who can navigate the hardest algorithmic corners of the problem, and one who already knows what it takes to actually sell something to a real customer.

Two people. No bloat. That's exactly the team you want for a focused vertical AI wedge.

How It Works

The technical challenge here is harder than it looks. Construction PDFs are massive (500+ pages is common), dense with overlapping symbols, inconsistently drawn across different firms and projects, and full of annotation noise. Standard OCR is useless. Generic vision models trained on ImageNet-style data don't understand that a filled circle with a slash means "receptacle" and a different filled circle means "junction box."

Bidflow trained custom vision models on electrical CAD drawings, teaching them to recognize and count:

  • Power devices: receptacles, tele/data outlets, junction boxes, motors
  • Lighting fixtures: detected based on designation from lighting schedules
  • Lighting controls: occupancy/vacancy sensors, dimmers, photocells, switches

The pipeline works roughly like this: PDF ingestion → page segmentation → symbol detection → count aggregation → structured output. The frontend gives estimators a PDF viewer with panning/zooming and a manual audit layer so they can correct AI errors before exporting. That human-in-the-loop design is smart: it builds trust incrementally rather than asking contractors to blindly trust a black box with their profit margins.
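The last two stages (count aggregation → structured output) are the easiest to picture in code. Here is a minimal sketch, assuming each page's detections arrive as dictionaries with a `type` and a `confidence`. The function name, the dictionary shape, and the 0.45 cutoff are illustrative assumptions, not Bidflow's actual implementation.

```python
from collections import Counter


def aggregate_counts(pages: list[list[dict]], min_confidence: float = 0.45) -> dict[str, int]:
    """Sum detections per device type across every page of a drawing set."""
    totals = Counter()
    for detections in pages:
        for det in detections:
            # Low-confidence detections are dropped here; a real tool might
            # instead flag them for the estimator to review.
            if det["confidence"] >= min_confidence:
                totals[det["type"]] += 1
    return dict(totals)


# Hypothetical detections for a two-page drawing set
pages = [
    [{"type": "receptacle", "confidence": 0.93}, {"type": "junction_box", "confidence": 0.88}],
    [{"type": "receptacle", "confidence": 0.71}, {"type": "receptacle", "confidence": 0.20}],
]
print(aggregate_counts(pages))  # {'receptacle': 2, 'junction_box': 1}
```

The resulting dictionary is what a CSV export or a sidebar count would be built from.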

The lighting schedule integration is a meaningful technical detail. Different projects use different fixture designations (A1, B2, etc.) that map to actual fixture types. Bidflow reads the legend and uses it contextually, which means the model isn't just doing raw symbol matching; it's doing document-level reasoning about what each symbol means in context.
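How Bidflow actually resolves designations is not public, but the idea can be sketched under simple assumptions: the schedule has already been parsed into a designation-to-description mapping, and each fixture symbol has a text tag read from beside it. `SCHEDULE`, `count_fixtures`, and the sample tags below are all hypothetical.

```python
# Hypothetical lighting schedule, already parsed from the drawing's legend
SCHEDULE = {"A1": "2x4 LED troffer", "B2": "6in LED downlight"}


def count_fixtures(tags: list[str], schedule: dict[str, str]) -> tuple[dict[str, int], list[str]]:
    """Count fixture tags by designation; return tags missing from the schedule separately."""
    counts: dict[str, int] = {}
    unknown: list[str] = []
    for raw in tags:
        tag = raw.strip().upper()  # normalize "a1 " to "A1"
        if tag in schedule:
            counts[tag] = counts.get(tag, 0) + 1
        else:
            unknown.append(raw)  # surface for manual review rather than guessing
    return counts, unknown


counts, unknown = count_fixtures(["A1", "a1 ", "B2", "C3"], SCHEDULE)
print(counts)   # {'A1': 2, 'B2': 1}
print(unknown)  # ['C3']
```

Keeping unmatched tags in a separate list fits the human-in-the-loop design: the estimator decides what "C3" means instead of the tool silently dropping it.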

The Market Timing Argument

This would have been a fine startup two years ago. In 2026 it's a great one. The data center construction boom is creating unprecedented demand for electrical contractors, and the existing estimating workforce simply can't scale to meet it. A single estimator at a mid-size electrical contractor might process 5-10 bids per month manually. With Bidflow, that same estimator can theoretically handle 40-50. The unit economics of the construction bidding pipeline just changed permanently.

The legacy software incumbents (Accubid, ConEst, STACK) are slow-moving SaaS businesses that have been selling the same workflows for 15+ years. They're adding "AI features" the same way Microsoft added Clippy: reluctantly, badly, and too late. Bidflow is building the thing from scratch with AI as the core, not the garnish.

Difficulty Score

| Dimension | Score | Why |
|---|---|---|
| ML / AI | 8/10 | Custom vision models for domain-specific symbol recognition in noisy PDFs with inconsistent symbol styles across firms and projects |
| Data | 8/10 | Labeled electrical drawing datasets don't exist publicly. You need to source, digitize, and annotate thousands of real drawings with electrician domain expertise |
| Backend | 4/10 | PDF processing pipeline, model serving, basic CRUD, CSV export. Standard stuff, well-understood at this scale |
| Frontend | 5/10 | PDF viewer with annotation/audit UI is non-trivial but solved: react-pdf, canvas rendering, bounding box overlays |
| DevOps | 3/10 | Standard cloud deployment, model hosting. Nothing exotic needed at current scale |
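The 5.6 build-difficulty figure near the top of the article appears to be the unweighted mean of these five scores. That is an inference from the numbers rather than something the article states, but the arithmetic checks out:

```python
from statistics import mean

# Scores copied from the difficulty table above
scores = {"ML / AI": 8, "Data": 8, "Backend": 4, "Frontend": 5, "DevOps": 3}

build_difficulty = mean(scores.values())
print(f"{build_difficulty:.1f}")  # 5.6
```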

The Moat

The real moat is not the model, it's the data. Electrical drawing datasets with ground-truth device counts do not exist in any public corpus. Bidflow has to acquire real drawings from contractors, label them (which requires actual electricians, not MTurk workers who don't know what a 20A GFCI looks like), and iterate. Every drawing they process in production is training signal for the next version of the model. This is the classic data flywheel, and it's the part that takes time to build.

The secondary moat is domain trust. Electrical contractors are conservative buyers. They don't switch tools without a reason, and they definitely don't trust AI with their bid accuracy without proof. Bidflow's "proof of work" approach (showing the audit layer and letting estimators verify every count) is how you earn that trust. Once a contractor has used Bidflow on 50 jobs and it's been right 49.5 times, they're not switching. The switching cost isn't technical; it's psychological.

What's easy to replicate: The pricing model, the basic PDF upload flow, the CSV export, the general architecture. Any competent team can wire these together in a few months.

What's hard to replicate: The labeled training data, the domain-specific model accuracy (especially at lighting schedule interpretation), and the trust accumulated with actual customers.

Replicability Score

38 / 100. This is a real but permeable moat. The core architecture (PDF → vision model → count) is not a technical secret. If a well-funded competitor obtained a large corpus of labeled electrical drawings and had six months, they could probably match 90% of the functionality. The data acquisition is the actual bottleneck, not the engineering. The $0.03/device model makes this a volume business: whoever processes the most drawings wins. First-mover advantage matters here, but it's not insurmountable.

The Replication Blueprint

The pieces are all available: PDF processing libraries (PyMuPDF, pdfplumber), computer vision fine-tuning (YOLO or Detectron2 for symbol detection, plus an OCR toolkit like PaddleOCR for schedule text), cloud model serving (AWS SageMaker, Modal), and a React PDF viewer for the frontend. The real work is assembling a training dataset and grinding through the domain knowledge. Bidflow's founders know this, which is why they're moving fast to lock in customers and drawing volume before someone better-capitalized decides to clone them.

The plays that could disrupt them: a larger construction software company (Autodesk, Procore) building or buying this capability, or a general-purpose vision model getting good enough at electrical symbols without domain-specific training. The former is a real risk. The latter is probably 2-3 years away and still requires domain-specific fine-tuning to reach 99% accuracy.

Verdict

Bidflow is the archetype of what YC vertical AI looks like in 2026: take a brutally manual workflow in an unsexy industry, apply a custom-trained vision model with surgical focus, charge a micro-transaction fee, and scale by processing volume. The founders have the right skills (algorithms + sales hustle), the market timing is excellent (data center boom creating contractor demand spike), and the wedge is narrow enough to execute without a large team.

The risk is commoditization, but the winners in vertical AI tend to be the ones who got the training data first. If Bidflow locks up enough electrical contractors and processes enough drawings in the next 18 months, the moat becomes significantly harder to breach. Watch the job board: if they're hiring ML engineers to expand into mechanical or plumbing takeoffs, it's working.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.

Build This Startup with Claude Code

Complete replication guide — install as a slash command or rules file

# How to Build a Bidflow Clone with Claude Code

A step-by-step guide to building an AI-powered electrical takeoff tool.

---

## Step 1: Database Schema

Create a PostgreSQL schema (Supabase works great here):

```sql
CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email TEXT UNIQUE NOT NULL,
  stripe_customer_id TEXT,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE projects (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES users(id),
  name TEXT NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE drawings (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id UUID REFERENCES projects(id),
  file_name TEXT,
  s3_key TEXT,
  page_count INT,
  status TEXT DEFAULT 'pending', -- pending | processing | done | error
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE device_counts (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  drawing_id UUID REFERENCES drawings(id),
  page_number INT,
  device_type TEXT, -- 'receptacle' | 'junction_box' | 'fixture' | 'sensor' | etc.
  count INT,
  bounding_boxes JSONB, -- [{x,y,w,h,confidence}]
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE billing_events (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES users(id),
  drawing_id UUID REFERENCES drawings(id),
  devices_detected INT,
  amount_cents INT,
  stripe_charge_id TEXT,
  created_at TIMESTAMPTZ DEFAULT NOW()
);
```

## Step 2: PDF Ingestion Pipeline

Build a FastAPI service that handles drawing uploads:

```python
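# api/routes/drawings.py
import os
import uuid

import boto3
import fitz  # PyMuPDF
from fastapi import APIRouter, BackgroundTasks, UploadFile

# Hypothetical app-level helpers, assumed to live elsewhere in this project:
#   create_drawing_record(project_id, s3_key, page_count) -> inserts a row and returns it
#   process_drawing(drawing_id) -> renders pages and runs detection
from api.services.drawings import create_drawing_record, process_drawing

router = APIRouter()
S3 = boto3.client("s3")
BUCKET = os.environ["DRAWINGS_BUCKET"]  # assumed environment variable name

# Declared with plain `def` so FastAPI runs the blocking S3 and PDF work in its threadpool
@router.post("/api/projects/{project_id}/drawings")
def upload_drawing(project_id: str, file: UploadFile, background_tasks: BackgroundTasks):
    # 1. Read the upload once so pages can be counted without re-downloading from S3
    pdf_bytes = file.file.read()

    # 2. Count pages
    with fitz.open(stream=pdf_bytes, filetype="pdf") as doc:
        page_count = len(doc)

    # 3. Store the PDF in S3; the UUID prefix avoids collisions between same-named files
    s3_key = f"drawings/{project_id}/{uuid.uuid4()}-{file.filename}"
    S3.put_object(Bucket=BUCKET, Key=s3_key, Body=pdf_bytes, ContentType="application/pdf")

    # 4. Create DB record
    drawing = create_drawing_record(project_id, s3_key, page_count)

    # 5. Kick off async processing
    background_tasks.add_task(process_drawing, drawing.id)
    return drawing

def render_page_to_image(page: fitz.Page, dpi: int = 150) -> bytes:
    """Render one PDF page to PNG bytes. PDF user space is 72 units per inch."""
    mat = fitz.Matrix(dpi / 72, dpi / 72)
    pix = page.get_pixmap(matrix=mat)
    return pix.tobytes("png")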
```

## Step 3: Symbol Detection Model

Fine-tune a YOLO model on labeled electrical drawings:

```python
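# ml/train.py
import io

from PIL import Image
from ultralytics import YOLO

# Data collection strategy:
# 1. Source drawings from willing contractors (offer free credits)
# 2. Label with CVAT or Label Studio
# 3. Have electricians verify labels -- not random annotators
# Category map (order must match `names` in electrical_drawings.yaml):
CLASSES = [
    "receptacle_duplex",
    "receptacle_gfci",
    "junction_box",
    "motor",
    "data_outlet",
    "fixture_a",  # Per lighting schedule
    "fixture_b",
    "occupancy_sensor",
    "dimmer",
    "switch_single",
    "switch_3way",
    "photocell",
]

def train() -> None:
    model = YOLO("yolo11m.pt")  # Start from pretrained
    model.train(
        data="electrical_drawings.yaml",
        epochs=200,
        imgsz=1280,  # High-res for small symbols
        batch=16,
        device=0,  # first CUDA GPU
        degrees=15.0,  # random rotation range; tune for your drawings
        scale=0.5,  # random scale gain -- drawings vary a lot
    )

# For inference:
# Assumed path: Ultralytics saves weights under runs/detect/<run name>/weights/
WEIGHTS = "runs/detect/train/weights/best.pt"
_model = None  # loaded lazily so importing this module does not require weights

def detect_devices(image_bytes: bytes) -> list[dict]:
    global _model
    if _model is None:
        _model = YOLO(WEIGHTS)
    # predict() does not take raw bytes, so decode the PNG into a PIL image first
    image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    results = _model.predict(image, conf=0.45, iou=0.5, imgsz=1280)
    detections = []
    for box in results[0].boxes:
        detections.append({
            "type": results[0].names[int(box.cls[0])],
            "confidence": float(box.conf[0]),
            # normalized [center_x, center_y, width, height]
            "bbox": box.xywhn[0].tolist(),
        })
    return detections

if __name__ == "__main__":
    train()  # only train when run as a script, not when imported for inference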
```
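One gap between Step 1 and Step 3: the `bounding_boxes` column stores `{x,y,w,h,confidence}` objects, while YOLO's `xywhn` is a normalized box measured from its center. Here is a minimal conversion sketch, assuming the stored `x` and `y` are meant to be the top-left corner in rendered-page pixels (the schema doesn't say). `to_pixel_box` is a hypothetical helper name.

```python
def to_pixel_box(xywhn: list[float], confidence: float,
                 page_width: int, page_height: int) -> dict:
    """Convert a normalized center-based box into a top-left pixel box."""
    cx, cy, w, h = xywhn
    return {
        "x": round((cx - w / 2) * page_width),   # left edge in pixels
        "y": round((cy - h / 2) * page_height),  # top edge in pixels
        "w": round(w * page_width),
        "h": round(h * page_height),
        "confidence": round(confidence, 3),
    }


# A 36x24 inch sheet rendered at 150 DPI is 5400x3600 pixels
box = to_pixel_box([0.5, 0.25, 0.02, 0.04], 0.91, page_width=5400, page_height=3600)
print(box)  # {'x': 2646, 'y': 828, 'w': 108, 'h': 144, 'confidence': 0.91}
```

Storing pixel boxes ties them to one render resolution. Keeping the normalized values and converting in the viewer is an equally reasonable choice.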

## Step 4: API Design

RESTful API with async job processing:

```
POST   /api/projects                    # Create project
GET    /api/projects/{id}               # Get project + drawings

POST   /api/projects/{id}/drawings      # Upload PDF (multipart)
GET    /api/drawings/{id}               # Poll job status
GET    /api/drawings/{id}/results       # Get device counts + bboxes
PATCH  /api/drawings/{id}/results       # Submit manual corrections
GET    /api/drawings/{id}/export        # Download CSV

POST   /api/billing/webhook             # Stripe webhook
GET    /api/billing/usage               # Current user usage
```

Use Redis + Celery (or Modal) for async job processing. Each drawing page is an independent task -- parallelize across pages for speed.
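The per-page fan-out can be prototyped locally before bringing in a task queue. The sketch below uses a thread pool from the standard library as a stand-in. In production, something like a Celery group or Modal's map would take its place. `detect_page` and its canned results are placeholders, not real detection.

```python
from concurrent.futures import ThreadPoolExecutor


def detect_page(page_number: int) -> dict:
    """Placeholder for the real per-page task (render the page, run detection)."""
    canned = {1: ["receptacle", "receptacle"], 2: ["junction_box"], 3: []}
    return {"page": page_number, "types": canned.get(page_number, [])}


def process_drawing_pages(page_numbers: list[int], max_workers: int = 4) -> list[dict]:
    """Run every page as an independent task and collect results in page order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order even if pages finish out of order
        return list(pool.map(detect_page, page_numbers))


results = process_drawing_pages([1, 2, 3])
print([r["page"] for r in results])  # [1, 2, 3]
```

Because pages are independent, a failure on one page can be retried without reprocessing the whole drawing.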

## Step 5: Frontend PDF Viewer

React with react-pdf (a PDF.js wrapper) for rendering and a custom SVG overlay for bounding boxes:

```tsx
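// components/DrawingViewer.tsx
import { useState } from 'react';
import { Document, Page } from 'react-pdf';
// Hypothetical app-level modules, assumed to exist elsewhere in this project
import { BoundingBoxOverlay, DeviceCountSidebar } from './overlays';
import { useDrawingResults, aggregateCounts, exportCSV, type Correction } from '../lib/drawings';

// Note: react-pdf also needs pdfjs.GlobalWorkerOptions.workerSrc set once at app startup

type Props = { drawingId: string; pdfUrl: string; containerWidth: number };

export function DrawingViewer({ drawingId, pdfUrl, containerWidth }: Props) {
  const { data: results } = useDrawingResults(drawingId);
  const [corrections, setCorrections] = useState<Correction[]>([]);
  const [currentPage] = useState(1); // react-pdf pages are 1-based; navigation omitted

  // Record a manual reclassification so it can be audited and exported
  // (assumes Correction has the shape { id, newType })
  const addCorrection = (id: string, newType: string) =>
    setCorrections((prev) => [...prev, { id, newType }]);

  return (
    <div className="relative">
      <Document file={pdfUrl}>
        <Page pageNumber={currentPage} width={containerWidth}>
          {/* SVG overlay for bounding boxes */}
          <BoundingBoxOverlay
            detections={results?.pages?.[currentPage] ?? []}
            onCorrect={(id, newType) => addCorrection(id, newType)}
          />
        </Page>
      </Document>
      <DeviceCountSidebar
        counts={aggregateCounts(results, corrections)}
        onExportCSV={() => exportCSV(drawingId, corrections)}
      />
    </div>
  );
}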
```

## Step 6: Stripe Metered Billing

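Charge $0.03 per correctly detected device. This sketch uses one-off Stripe PaymentIntents rather than Stripe's metered-billing product, and defers any charge below Stripe's $0.50 minimum: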

```python
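# billing/stripe_client.py
import os

import stripe

# Hypothetical app-level helpers, assumed to be defined elsewhere in this project
from billing.store import (
    accumulate_balance,
    clear_pending_balance,
    get_pending_balance,
    get_user,
    log_billing_event,
)

stripe.api_key = os.environ["STRIPE_SECRET_KEY"]

PRICE_CENTS_PER_DEVICE = 3  # $0.03 per device
MIN_CHARGE_CENTS = 50  # Stripe minimum for USD charges

def charge_for_detections(user_id: str, drawing_id: str, device_count: int):
    user = get_user(user_id)

    # Add any sub-minimum amounts carried over from earlier drawings
    new_cents = device_count * PRICE_CENTS_PER_DEVICE
    amount_cents = new_cents + get_pending_balance(user_id)

    if amount_cents < MIN_CHARGE_CENTS:
        # Too small to charge yet; carry it forward to the next drawing
        accumulate_balance(user_id, new_cents)
        return None

    intent = stripe.PaymentIntent.create(
        amount=amount_cents,
        currency="usd",
        customer=user.stripe_customer_id,
        payment_method=user.default_payment_method,
        confirm=True,
        off_session=True,  # the customer is not present when this runs
        metadata={"drawing_id": drawing_id, "devices": str(device_count)},
    )

    clear_pending_balance(user_id)
    log_billing_event(user_id, drawing_id, device_count, amount_cents, intent.id)
    return intent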
```

## Step 7: Deployment

Deploy for minimal ops overhead:

```
# Model serving -- Modal for GPU inference on-demand
# modal deploy ml/serve.py
@app.function(gpu="A10G", image=ml_image)
def run_detection(image_bytes: bytes, model_version: str) -> list[dict]:
    model = load_model(model_version)
    return detect_devices(image_bytes)

# API -- Railway or Fly.io
# Dockerfile for FastAPI + Celery workers
FROM python:3.12-slim
RUN pip install fastapi uvicorn celery redis boto3 pymupdf stripe supabase
COPY . /app
CMD ["uvicorn", "api.main:app", "--host", "0.0.0.0", "--port", "8000"]

# Frontend -- Vercel
# vercel deploy

# Database -- Supabase
# Already handles auth, storage, and Postgres

# Estimated monthly cost at 10K drawings/month:
# Modal GPU: ~$120 (A10G, ~2min per drawing)
# Railway API: ~$20
# Supabase: ~$25
# S3 storage: ~$10
# Total: ~$175/month -- pays for itself after ~6K device detections
```
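The break-even claim in the cost estimate is easy to verify. This quick check uses only the figures quoted above and ignores payment processing fees:

```python
import math

PRICE_CENTS_PER_DEVICE = 3  # $0.03 per detected device

# Monthly cost estimates copied from the deployment notes above
monthly_costs_usd = {"Modal GPU": 120, "Railway API": 20, "Supabase": 25, "S3 storage": 10}

total_usd = sum(monthly_costs_usd.values())
break_even_devices = math.ceil(total_usd * 100 / PRICE_CENTS_PER_DEVICE)

print(total_usd)           # 175
print(break_even_devices)  # 5834
```

So roughly 5,800 billable detections a month cover the infrastructure, which matches the "~6K" figure.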

The critical path to getting from zero to paying customers: (1) get 10 real electrical drawings from contractors and label them, (2) fine-tune YOLO until accuracy hits 90%+, (3) ship the MVP to those contractors for free, (4) use their feedback to label more data, (5) charge when accuracy exceeds 95%. The technology is not the bottleneck. The training data is.