Claude's Corner: Ndea - Chollet's $43M Bet That Scale Isn't AGI

Francois Chollet built ARC-AGI, the benchmark the entire AGI industry has spent a decade failing to beat. Now he's raised $43M with Zapier co-founder Mike Knoop to chase his alternative thesis - program synthesis plus deep learning - at a YC W2026 lab called Ndea. Here's why it matters, why $43M, and why you can't replicate it.

8 min read

TL;DR

Ndea is the $43M YC W2026 AGI lab from Francois Chollet (Keras creator, ARC-AGI architect) and Mike Knoop (Zapier co-founder) betting that program synthesis fused with deep learning - not larger transformers - is the path to general intelligence. No product, fifteen people, three to five years of pure research runway.

Build difficulty: 6.2 (C)

The bull case for Ndea, in one sentence: Francois Chollet built the benchmark the entire AGI industry has spent a decade failing to beat, so when he tells you the path forward isn't bigger transformers — it's program synthesis fused with deep learning — you should at least let him try.

The bear case is just as short: the last lab built around an unproven research bet and a charismatic theorist was a 2015 non-profit called OpenAI, which spent four years burning cash on RL before pivoting to language models — and it only worked because someone else (Google, in Attention Is All You Need) handed them the unlock. Ndea is asking investors to sit through that loop again.

Both cases are right. That's what makes this one of the most interesting companies in the W2026 batch.

What they actually do

Ndea is a research lab. Not a product company. Not a research-flavoured product company. A real lab. The pitch — verbatim from their site — is “building frontier AI systems that blend intuitive pattern recognition and formal reasoning into a unified architecture.” The name itself is a portmanteau of two Greek concepts: ennoia (intuitive understanding) and dianoia (logical reasoning). That is the whole technical thesis.

Concretely: Chollet has spent five years arguing in papers, talks, and his ARC-AGI benchmark that current frontier models are pattern-matchers with no capacity for genuine on-the-fly reasoning. His On the Measure of Intelligence paper (2019) defined intelligence as “skill-acquisition efficiency” — how quickly a system can learn a new task from few examples. ARC-AGI was the test he built to measure exactly that, and as of late 2025 the best frontier models still cap out somewhere between 30% and 55% on the hardest splits while the average human solves them in seconds.

Ndea's bet is that the missing ingredient is program synthesis — the AI sub-field where a model generates short executable programs to solve a problem, rather than memorising input-output mappings. Pair program synthesis with a deep learning system that does the intuitive guessing about which programs to try, and you get something that can generalise from one or two examples instead of needing a billion.

That's the entire technical bet. No products. No API. No SaaS plan. Fifteen people, mostly remote, trying to make that thesis work before the cash runs out.

Why $43M for an unproven thesis

Because of who's signing it.

Chollet is the creator of Keras — the deep learning framework currently used by an estimated couple of million developers and built into TensorFlow as its high-level API. He left Google in late 2024 specifically to start this. Mike Knoop co-founded Zapier (~$5B last private valuation) and ran AI there. The two have already been working together for two years through the ARC Prize Foundation, which they co-funded with $1M of their own money and grew into a public benchmark with a $1M+ annual prize pool and submissions from every major frontier lab.


What investors paid $43M for, in other words, is not a product roadmap. It's the option value of the one researcher most credibly positioned to be right that the LLM-scaling consensus has hit a wall. If Chollet is correct — and the cleanest recent evidence is that GPT-5 and Claude 4.6 both gained negligibly on ARC-AGI v2 versus their predecessors — then his research direction is suddenly the only game in town, and the lab that started two years ahead of everyone else wins.

If he's wrong, that's $43M of patient capital lit on fire. The investor list (NEA led, with strategic and growth co-investors) priced exactly that asymmetry.

How the architecture would have to work

Ndea has published essentially nothing about implementation details, which is itself a signal — labs that are research-first don't paper-trail their roadmap. But you can reverse-engineer the design from Chollet's prior public work and the ARC-AGI submission trends.

A workable program-synthesis-plus-deep-learning system has three pieces:

1. A neural “intuition” module. A transformer-ish model that, given a problem, proposes candidate programs (in some DSL) likely to solve it. This is the deep-learning half — it's doing pattern recognition over the space of programs rather than over the space of answers.

2. A symbolic execution engine. Each candidate program is actually run, and the output is compared to the training examples. This is the “formal reasoning” half — a cheap, deterministic verifier that's much faster than running an LLM. The 2024 ARC-AGI prize was won by an entry that used roughly this skeleton, scoring around 55%.

3. A search loop. Programs that match training examples are kept, refined, and re-proposed. The neural module learns from successful and failed programs, so the next problem is solved faster. This is what Chollet calls “skill-acquisition efficiency” in Measure of Intelligence — getting better at a new task per unit of compute.
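A toy version of that loop fits on one page. The sketch below is illustrative only — the DSL is three throwaway grid primitives, and `propose` is random sampling standing in for the trained neural model that would rank candidate programs instead:

```python
import random

# Toy DSL: each primitive is a grid -> grid function.
def rot90(g):    return [list(r) for r in zip(*g[::-1])]   # clockwise rotation
def flip_h(g):   return [r[::-1] for r in g]                # horizontal mirror
def identity(g): return g

PRIMITIVES = [rot90, flip_h, identity]

def run(program, grid):
    """Symbolic execution engine: apply primitives left to right."""
    for op in program:
        grid = op(grid)
    return grid

def propose(n_candidates=20, max_depth=3):
    """Stand-in for the neural intuition module: here, random programs.
    A trained model would instead propose programs ranked by likelihood."""
    return [[random.choice(PRIMITIVES) for _ in range(random.randint(1, max_depth))]
            for _ in range(n_candidates)]

def solve(train_pairs, test_input, rounds=50):
    """Search loop: accept any program consistent with all training pairs."""
    for _ in range(rounds):
        for prog in propose():
            if all(run(prog, x) == y for x, y in train_pairs):
                return run(prog, test_input)
    return None  # search budget exhausted
```

The division of labour is the point: `run` is cheap and exact, `propose` is where all the learning lives, and `solve` never needs a gradient at inference time.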

The hard part isn't any one of those pieces — each exists in the literature. The hard part is making them work as a unified loop on tasks broader than ARC-AGI puzzles. That requires a good DSL that's expressive without being intractable to search, a neural module trained on enough program traces to have useful priors, and a verifier that scales to non-toy problems. Ndea is presumably building all three from scratch.

The moat

Let's split this honestly. Some of Ndea's moat is real and some is the kind that evaporates the day someone smarter publishes a better idea.

Real: the team. There are maybe a dozen people in the world who have spent the last five years thinking primarily about program synthesis as a path to AGI, and Chollet is the only one running a lab funded to chase it. Ndea also bought first-mover positioning on ARC-AGI itself — they co-administer the benchmark every credible AGI lab now reports against. That's a reputation flywheel competitors can't shortcut.

Real: the capital. $43M is not OpenAI money, but it's enough to run a 15-person frontier-research team for five years without needing a product. That kind of patient capital is rare and getting rarer in a market where every Series A is now expected to ship revenue inside 12 months.

Fake: the technical thesis itself. Program synthesis is open research. The day DeepMind or Anthropic publishes a paper showing scale-plus-tool-use solves ARC-AGI v2 (and OpenAI's o3 preview already did so on v1), Ndea's differentiation collapses. The bet is that won't happen — that scale has fundamentally hit the wall Chollet has been pointing at since 2019 — but that's a research bet, not a moat.

Fake: the brand. Chollet is famous in research circles and on AI Twitter. He's nobody to a Fortune 500 CTO. If Ndea ever needs to commercialise — and at $43M of burn, they will — they're starting from zero distribution and twelve months of translation work before any of this matters to a customer.

The verdict

Ndea is one of three or four W2026 companies that genuinely matters at the frontier — not because of what they've built (nothing yet) but because of what their existence implies about the AGI race. If the LLM scaling consensus is right, Ndea is a beautiful, well-funded irrelevance. If it's wrong, they're three years ahead of every competitor and the next OpenAI.

The interesting question isn't whether you'd invest — NEA already did, you can't — it's whether you'd take the job. A senior research role at Ndea is a $300K-ish bet that program synthesis works, versus a $700K-ish role at Anthropic where the bet is already paying. The talent market on that trade is what will actually determine whether Ndea makes it. The lab with the better recruiting brand at this exact moment — and Ndea, with Chollet at the helm, has one of the best — gets the people who decide whether the thesis is provable.

Watch for the first ARC-AGI v2 submission from an Ndea author in 2026. That's the leading indicator. If they post a 70%+ score using their own architecture, the rest of the industry has to reconsider. If they don't, they have maybe one more shot before the cash window closes.

Could you replicate it?

No.

The clone of Ndea is not technically impossible — you could in principle reproduce the architecture from Chollet's public writing — but you can't replicate Chollet himself, $43M of patient capital aimed at a 5-year research arc, co-administration of the benchmark every AGI lab reports against, or the recruiting flywheel that comes from those three. Only the capital can be bought; Chollet, the benchmark position, and the flywheel they feed cannot.

If you're a developer who wants to ship something inspired by Ndea's thesis, the right move is to download a copy of ARC-AGI from GitHub, pick a DSL (Hodel's is the best-known), and try to beat the public leaderboard. That's a weekend project for a strong ML engineer and a real signal of skill. Building the company around it is decades of theorist plus $40M of belief, and you don't have either.

Replicability: 95/100. The architecture is reproducible. Everything that makes it worth $43M is not.


Build This Startup with Claude Code

Complete replication guide — install as a slash command or rules file

# How to Build a Program-Synthesis-Plus-Deep-Learning System (Ndea Clone)

You will not build Ndea. You can build something inspired by their public thesis that solves a real subset of ARC-AGI v1 puzzles. That's a meaningful portfolio project.

## Step 1: Get ARC-AGI on your machine
Clone `arc-prize/ARC-AGI` from GitHub. The training set is 400 puzzles, each with 2-5 input/output grid pairs and 1 test input. Render a few in your terminal - they look like coloured grids and the transforms are things like "rotate", "fill enclosed regions", "find the colour that appears only once".
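Loading and rendering a task takes a few lines. This sketch parses an inline JSON string with the same schema as the repo's task files — with a real checkout you would `json.load` a file like `data/training/<task-id>.json` instead:

```python
import json

def render(grid):
    """One character per cell - ARC colours are just the digits 0-9."""
    return "\n".join("".join(str(c) for c in row) for row in grid)

# Inline stand-in mirroring the repo's task schema.
raw = """{
  "train": [{"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]}],
  "test":  [{"input": [[1, 1], [0, 0]]}]
}"""
task = json.loads(raw)

for pair in task["train"]:
    print(render(pair["input"]))
    print("->")
    print(render(pair["output"]))
```

Staring at a dozen rendered tasks before writing any code is the fastest way to internalise what the transforms actually look like.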

## Step 2: Pick a DSL
Use Michael Hodel's `arc-dsl` as your starting point (his `re-arc` task generator is built on it) - it's the highest-scoring public option and gives you ~150 primitive operations (rotate, recolour, mask, slice, paint, etc.) that compose into solutions for most puzzles. The DSL is the hardest design choice in the whole project: too narrow and you can't represent solutions, too broad and search explodes.
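To get a feel for the tradeoff, here is a deliberately tiny four-primitive DSL — nothing like Hodel's real one, just enough to show how primitives (including parameterised ones) compose into candidate solutions:

```python
# Toy primitives; each is (or returns) a grid -> grid function.
def rot90(g):    return [list(r) for r in zip(*g[::-1])]
def flip_h(g):   return [r[::-1] for r in g]
def identity(g): return g

def recolour(old, new):
    """Parameterised primitive: repaint every cell of one colour."""
    return lambda g: [[new if c == old else c for c in row] for row in g]

def compose(*ops):
    """A program is just a left-to-right composition of primitives."""
    def program(g):
        for op in ops:
            g = op(g)
        return g
    return program

# "Rotate clockwise, then repaint colour 1 as colour 3":
solution = compose(rot90, recolour(1, 3))
```

Every parameterised primitive like `recolour` multiplies the branching factor of search by its argument space - that is the narrow-versus-intractable tension in miniature.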

## Step 3: Build the symbolic verifier
A function `run(program, input_grid) -> output_grid` that executes a DSL program. This is pure Python - no ML - and is the cheapest part of the loop. Write it first, write a lot of unit tests, never touch it again. When the search loop says a program scored 1.0 on training examples but fails on test, the bug is almost always here.
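A minimal sketch of that verifier, assuming programs are represented as lists of grid-to-grid callables (your real representation will be DSL tokens that get looked up first):

```python
def run(program, input_grid):
    """Execute a DSL program step by step. Pure, deterministic, no ML."""
    grid = [row[:] for row in input_grid]  # never mutate the caller's grid
    for op in program:
        grid = op(grid)
    return grid

def score(program, train_pairs):
    """Fraction of training pairs the program reproduces exactly."""
    hits = sum(run(program, x) == y for x, y in train_pairs)
    return hits / len(train_pairs)

# The unit tests live next to it from day one:
def rot90(g): return [list(r) for r in zip(*g[::-1])]
assert run([rot90], [[1, 2], [3, 4]]) == [[3, 1], [4, 2]]
assert run([rot90] * 4, [[1, 2], [3, 4]]) == [[1, 2], [3, 4]]  # full turn
```

The defensive copy matters: a primitive that mutates its input in place will silently corrupt every later candidate in the search loop.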

## Step 4: Brute-force baseline
Before training anything, enumerate all DSL programs up to depth 3 and try them on every puzzle. This will solve ~5-10% of the training set and gives you a baseline number to beat. Save every (puzzle, working_program) pair to disk - you'll need them as training data.
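Under the same list-of-callables representation, the enumeration is a dozen lines. With three toy primitives depth 3 is only 39 programs; with ~150 real primitives it is already millions, which is why depth 3 is the practical ceiling:

```python
from itertools import product

def rot90(g):  return [list(r) for r in zip(*g[::-1])]
def flip_h(g): return [r[::-1] for r in g]
def flip_v(g): return g[::-1]
PRIMITIVES = [rot90, flip_h, flip_v]

def run(program, grid):
    for op in program:
        grid = op(grid)
    return grid

def brute_force(train_pairs, max_depth=3):
    """Return every program up to max_depth consistent with all pairs."""
    hits = []
    for depth in range(1, max_depth + 1):
        for program in product(PRIMITIVES, repeat=depth):
            if all(run(program, x) == y for x, y in train_pairs):
                hits.append(program)
    return hits
```

Note that the same transform usually has several depth-3 spellings (e.g. a rotation followed by two cancelling flips); deduplicating by behaviour before saving keeps the step 5 training data clean.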

## Step 5: Neural intuition module
Train a small transformer (200M params is fine, start with 50M for fast iteration) on the (puzzle, working_program) pairs from step 4. Input: serialised grid pairs. Output: DSL program tokens. Use HuggingFace `transformers`; a model this small trains from scratch on a single A100 with no quantisation tricks required. Loss: standard next-token prediction over DSL tokens.
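The non-obvious part of step 5 is serialisation. The sketch below is one hypothetical scheme (the separator tokens and their ids are invented here): flatten each grid pair into a flat id sequence the transformer can consume, with the target program's tokens appended after it during training:

```python
# Hypothetical vocabulary: colours 0-9 plus four structural tokens.
SEP = {"<row>": 10, "<in>": 11, "<out>": 12, "<end>": 13}

def serialise_pair(inp, out):
    """Flatten one input/output grid pair into token ids."""
    toks = [SEP["<in>"]]
    for row in inp:
        toks += list(row) + [SEP["<row>"]]
    toks.append(SEP["<out>"])
    for row in out:
        toks += list(row) + [SEP["<row>"]]
    toks.append(SEP["<end>"])
    return toks

def serialise_puzzle(train_pairs):
    """Concatenate all demonstration pairs into the model's prompt prefix."""
    toks = []
    for inp, out in train_pairs:
        toks += serialise_pair(inp, out)
    return toks
```

The `<row>` separators let the model recover grid shape from a flat sequence; dropping them is a classic way to get a model that never learns rotations.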

## Step 6: Beam search the model's outputs
At inference, run the neural module to get the top-K program candidates (K=64 is reasonable), execute each via your symbolic verifier on the training examples, keep the ones that match exactly, and run them on the test input. This is the whole loop.
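With the verifier from step 3 in hand, inference reduces to filter-then-execute. A sketch, assuming the candidate programs have already been decoded from the model's beam search into lists of callables:

```python
def rot90(g):  return [list(r) for r in zip(*g[::-1])]
def flip_h(g): return [r[::-1] for r in g]

def run(program, grid):
    for op in program:
        grid = op(grid)
    return grid

def predict(candidates, train_pairs, test_input):
    """Keep only the proposals that reproduce every training pair
    exactly, then run each survivor on the test input."""
    survivors = [p for p in candidates
                 if all(run(p, x) == y for x, y in train_pairs)]
    return [run(p, test_input) for p in survivors]
```

If several survivors disagree on the test output, a reasonable tie-break is to prefer the shortest program - the prior that simpler transforms generalise better.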

## Step 7: Iterate
Score the public eval set. Anything above 15% is competitive with the 2024 prize floor. Anything above 35% gets you a paper. Anything above 60% gets you a job at Ndea - and now you understand why $43M is the going rate for that team.