StartupHub.ai has learned exclusively that Cloudflare’s new Pay Per Crawl marketplace has its sights set on $500 million in revenue in its first year alone. A source close to the company shared this forecast.
This is a massive estimate, and it reveals how serious Cloudflare is about turning AI scraping into real money for publishers. It also reveals how rampant and unscrupulous web scraping has become, especially among LLM companies.
The marketplace is in beta now. It lets websites charge AI crawlers every time they fetch content.
Cloudflare controls about 20% of global web traffic and already runs much of the internet’s payment infrastructure. That makes this new revenue stream instantly scalable. Big publishers like Condé Nast, Time, and The Atlantic are already testing it. We do know they signed licensing deals with OpenAI and the like, but it remains to be seen how favorable those deals were.
The writing on the wall says AI bots are eating their content and not giving anything back.
Our own back-of-the-envelope math, using basic assumptions, also points to a sizable opportunity: about $300 million in revenue if Cloudflare takes only a 1-2% payment fee.
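To make that estimate concrete, here is a minimal sketch of the arithmetic behind it. It assumes the $300 million is Cloudflare's fee revenue and that the fee is a flat 1-2% of the payments crawlers make to publishers; the function name and the reading of the figures are ours for illustration, not Cloudflare's.

```python
def implied_payment_volume(fee_revenue: float, fee_rate: float) -> float:
    """Gross crawler-to-publisher payments needed for a flat fee to yield fee_revenue.

    Hypothetical helper: assumes the fee is a simple percentage of payments.
    """
    if not 0 < fee_rate <= 1:
        raise ValueError("fee_rate must be a fraction between 0 and 1")
    return fee_revenue / fee_rate


TARGET_FEE_REVENUE = 300_000_000  # the $300 million estimate, read as fee revenue

for rate in (0.01, 0.02):  # the 1-2% fee range
    volume = implied_payment_volume(TARGET_FEE_REVENUE, rate)
    print(f"{rate:.0%} fee -> ${volume / 1e9:.0f} billion in crawler payments")
```

Under those assumptions, AI crawlers would need to pay publishers roughly $15 billion to $30 billion a year through the marketplace for the fees alone to reach $300 million.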
Why are these bots scraping everything in sight? It comes down to two needs. First is pretraining—LLMs must digest mountains of text to learn how language works. Second is inference—some models crawl the web constantly to stay up to date. When you see a bot labeled “GPTBot,” that’s OpenAI. But nobody really believes that’s their only crawler. It’s very possible they have a whole fleet of unnamed bots or work with partners who scrape under generic labels. Some even spoof their identity to look like browsers. And it’s not just big names—there’s a whole shadow economy of data brokers and scrapers collecting content at scale.
At an Axios event at Cannes Lions earlier this month, Cloudflare CEO Matthew Prince laid it out in plain language. "Ten years ago, Google crawled two pages for every visitor it sent you. Today, it’s 18 pages per visitor. OpenAI crawls 1,500 pages for every click it drives. Anthropic? 60,000 to 1."
He said AI companies are building products that strip out links and keep users locked in. “People aren’t following the footnotes,” Prince warned. For publishers, that means no traffic and no revenue.
He doesn’t think basic licensing deals are the answer. He believes great content deserves higher prices. Not all information is equal—reporting and original research should cost more than memes.
Google isn’t blind to this. They’re rolling out ways for publishers to get paid or to block AI crawlers altogether. Cloudflare’s move is forcing the big platforms to rethink how they compensate content creators.
This is a turning point for the web. Cloudflare is betting that the era of free scraping is ending. If you run a site, you’re about to have real options. If you build AI, your real costs are about to surface.
In addition, the idea of AI agents having unfettered access to the internet was put to the test last week when ChatGPT Agent finally debuted. It was billed as able to roam freely across the internet, accessing sites, scraping data, and acting as a digital employee. Shortly after its release, users found that the Agent could not access many of the websites it needed. That roadblock is the work of companies like Cloudflare, which are acting as referee against rampant agentic bots.
Updated on July 27 to address ChatGPT Agent and its inability to access many websites.

