DeepSeek-V3, the model that turned heads in early 2025, cost roughly $5.6 million to train on H800 GPUs, according to the company's own technical report — a number that helped reframe global assumptions about the compute cost of frontier AI. Its successor, DeepSeek-V4-Pro, released on April 24, 2026, scales to 1.6 trillion total parameters while keeping API input pricing at $1.74 per million tokens, around 13 times cheaper than comparable US models at launch.
Liang Wenfeng's DeepSeek: $5.6M Training Budget, $45B Valuation
DeepSeek-V3 cost $5.6 million to train. V4-Pro charges $1.74 per million tokens. And Liang Wenfeng is now raising $10 billion at a $45 billion valuation — here is what we know about DeepSeek's finances.
How DeepSeek built frontier AI for under $6 million
The headline number from DeepSeek's V3 technical report, published in December 2024 on arXiv, is 2.788 million H800 GPU hours for the full training run, which at roughly $2 per GPU-hour comes to approximately $5.576 million. The report notes that this figure covers only the official training run and excludes earlier research and ablation experiments. The model has 671 billion total parameters but uses a Mixture-of-Experts architecture that activates only 37 billion per token, reducing inference and training compute relative to a dense model of the same nominal size. Pre-training covered 14.8 trillion tokens in under two months on a 2,048-GPU cluster.
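The arithmetic behind the headline figure can be checked in a few lines. This is a minimal sketch using only the numbers quoted above; the function names are illustrative, and the wall-clock estimate assumes every GPU in the cluster ran continuously for the whole run, which is an idealisation rather than something the report states.

```python
# Figures as quoted above from the V3 technical report.
GPU_HOURS = 2_788_000        # H800 GPU hours for the full training run
USD_PER_GPU_HOUR = 2.0       # rental rate used in the estimate
CLUSTER_GPUS = 2_048         # size of the training cluster


def training_cost_usd(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Rental-equivalent cost: GPU hours multiplied by the hourly rate."""
    return gpu_hours * usd_per_gpu_hour


def ideal_wall_clock_days(gpu_hours: float, gpus: int) -> float:
    """Days needed if every GPU ran continuously (an assumption, not a reported figure)."""
    return gpu_hours / gpus / 24


cost = training_cost_usd(GPU_HOURS, USD_PER_GPU_HOUR)
days = ideal_wall_clock_days(GPU_HOURS, CLUSTER_GPUS)
print(f"Estimated cost: ${cost / 1e6:.3f} million")
print(f"Idealised wall-clock time: {days:.1f} days")
```

The first line reproduces the $5.576 million figure. The second comes to a little under 57 days, which is in line with the sub-two-month timeline, though the calculation says nothing about how the hours were actually scheduled.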
V4-Pro, released in April 2026, about 16 months after the V3 report, extends this architecture: 1.6 trillion total parameters, 49 billion active per token, and a default one-million-token context window. At launch it scored 80.6% on SWE-bench Verified, the highest result on that coding benchmark of any model at the time of release, according to benchmark tracker BenchLM. DeepSeek labels V4-Pro a preview release, meaning final performance figures may shift before general availability.
The scaling pattern across DeepSeek's model generations reflects a deliberate architectural bet. Rather than increasing the number of parameters that every token must pass through, the team progressively grows total model size while keeping the active-parameter footprint small. V4-Flash, the lighter version released alongside V4-Pro, uses just 13 billion active parameters drawn from a 284-billion-parameter pool, making it cheaper to serve than V3 on a per-token basis despite being a newer model.
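One way to see the bet is to compare the share of weights each model activates per token. The sketch below uses only the parameter counts quoted in this article (in billions); the dictionary layout and function name are illustrative, and the active share is a rough proxy for per-token compute, not a measured serving cost.

```python
# (total parameters, active parameters per token), both in billions,
# as quoted in the article.
MODELS = {
    "DeepSeek-V3": (671, 37),
    "DeepSeek-V4-Pro": (1600, 49),
    "DeepSeek-V4-Flash": (284, 13),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's weights used for a single token."""
    return active_b / total_b


for name, (total_b, active_b) in MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {active_b}B of {total_b}B active ({share:.1%})")
```

From V3 to V4-Pro, total parameters grow roughly 2.4 times while active parameters grow only about 1.3 times, so the active share falls even as the model gets larger.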
The $10 billion fundraise and a valuation that tripled in six weeks
Until April 2026, DeepSeek had no external investors. High-Flyer Capital Management funded all research spending directly, without venture-capital dilution or a public markets timeline. That changed when the company opened its first external round, targeting at least $300 million at a valuation above $10 billion, according to reports in April 2026. The China Integrated Circuit Industry Investment Fund, the state-backed vehicle known colloquially as the "Big Fund", is leading the round, with Tencent Holdings and Alibaba Group in separate talks to participate, per Bloomberg.
By late May 2026, the reported valuation had reached $45 billion, per TechCrunch, up from a $3.4 billion secondary-market figure in 2025. Liang himself is expected to commit roughly 20 billion yuan (approximately $2.7 billion) of personal capital to the round. Battery giant CATL is reportedly considering a 5 billion yuan contribution. High-Flyer remains the controlling shareholder throughout. Liang controls close to 90% of the company.
DeepSeek has stated its priorities plainly to potential investors: management told the round's participants that the startup will prioritise groundbreaking AI research over short-term commercialisation, per Bloomberg's May 22 reporting. The Information reported separately that DeepSeek is simultaneously beginning to plot revenue efforts, which suggests a two-track structure: research-first public messaging, with commercial product work running in parallel. Compare this with Ilya Sutskever's SSI, which has framed a similar research-over-revenue posture while raising money at a reported $32 billion valuation, even though many analysts consider it pre-revenue.
API pricing as a structural weapon against US incumbents
DeepSeek's commercial strategy since V3 has been aggressive pricing rather than traditional enterprise sales cycles. V4-Flash, the speed-optimised model released April 24, 2026, costs $0.14 per million input tokens on a cache-miss basis, according to DeepSeek's own API documentation — 18 times cheaper than GPT-5.4 at $2.50 per million and 36 times cheaper than Claude Opus 4.7 at $5.00 per million, per pricing aggregator CostGoat. V4-Pro launched with a 75% discount, per The Next Web, and DeepSeek cut cache-hit prices across the entire API suite to one-tenth of original rates on April 26.
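The multiples quoted above follow directly from the list prices. The sketch below reproduces them from the per-million-token input prices cited in this article; the function names and the 500-million-token monthly workload are hypothetical, chosen only to show how the gap scales, and output-token prices and cache-hit rates are ignored.

```python
# Input prices in USD per million tokens, as quoted in the article
# (V4-Flash on a cache-miss basis).
PRICES = {
    "DeepSeek-V4-Flash": 0.14,
    "GPT-5.4": 2.50,
    "Claude Opus 4.7": 5.00,
}


def price_multiple(baseline: float, other: float) -> float:
    """How many times more expensive `other` is than `baseline`."""
    return other / baseline


def input_cost_usd(tokens: int, usd_per_million: float) -> float:
    """Input-token bill for a workload at a given per-million-token price."""
    return tokens / 1_000_000 * usd_per_million


MONTHLY_TOKENS = 500_000_000  # hypothetical workload, for illustration only
flash = PRICES["DeepSeek-V4-Flash"]

for name, price in PRICES.items():
    multiple = price_multiple(flash, price)
    bill = input_cost_usd(MONTHLY_TOKENS, price)
    print(f"{name}: {multiple:.1f}x V4-Flash, ${bill:,.2f} per month")
```

The ratios come out at roughly 17.9 and 35.7, which round to the 18 and 36 times cited above.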
The pricing philosophy maps directly to Liang's stated views on competitive dynamics. In a widely cited interview published by The China Academy, he argued: "In disruptive tech, closed-source moats are fleeting. Even OpenAI's closed-source model can't prevent others from catching up. Therefore, our real moat lies in our team's growth: accumulating know-how, fostering an innovative culture." The implication is that pricing is a tool to establish developer adoption, not a path to margin expansion.
The downstream effect inside China's AI market is already visible. Domestic competitors including Zhipu's GLM 5.1 and Moonshot's Kimi K2.6 have faced direct pricing pressure since DeepSeek's aggressive cuts, per market analysis from CloudZero. The pattern resembles what Sam Altman's OpenAI did to API pricing globally in 2023 and 2024 — except DeepSeek is doing it from a cost base that appears structurally lower, not just subsidised.
What it means
DeepSeek's financial structure is unlike that of any other AI lab operating at this scale. There is no venture board setting a return timeline, no IPO process shaping disclosure practices, and the primary backer — High-Flyer — is a quantitative fund that recorded a 57% gain in 2026 per Bloomberg, providing ongoing capital without requiring DeepSeek to show revenue. The first external funding round preserves that structure: Liang's personal commitment dwarfs the institutional cheques, and High-Flyer retains control. Whether the research-over-commercialisation framing survives the addition of state-fund and tech-giant investors is the central question as the round closes.
Sources
Bloomberg: DeepSeek Founder Avows AGI Goal Ahead of $10 Billion Funding
Bloomberg: DeepSeek Founder Liang's Funds Surge 57% as China Quants Boom
The Information: DeepSeek To Raise More than $7 Billion as Startup Plots Revenue Efforts
TechCrunch: DeepSeek could hit $45B valuation from its first investment round
arXiv: DeepSeek-V3 Technical Report
The Next Web: DeepSeek cuts V4-Pro prices by 75%
DeepSeek API Docs: Models & Pricing
The China Academy: Interview with DeepSeek Founder — We're Done Following. It's Time to Lead.
BenchLM: DeepSeek V4 Pro Benchmarks 2026
CloudZero: DeepSeek pricing 2026