June 7, 2026

How the Neocloud Business Works

Neocloud explained: $25B market in 2025, take-or-pay GPU contracts, 70% breakeven utilization, and why time-to-power decides who profits.

A neocloud is a pure-play cloud provider that rents GPU compute for AI training and inference, as opposed to general-purpose hyperscalers like AWS or Azure. The category generated over $25 billion in revenue in 2025 and is forecast to approach $400 billion by 2031 (Synergy Research, 2026). The business model is simple to state and brutal to execute: sign multi-year take-or-pay contracts for GPU capacity, borrow against those contracts, and race depreciation to the bank.

This post covers what a neocloud is, who the players are and how big they've become, the unit economics, the contract and debt machinery, and why power, not GPUs, is now the constraint that decides who wins.

What a neocloud actually is

The term emerged in late 2024 and went mainstream through 2025. The canonical analyst framing comes from SemiAnalysis's "AI Neocloud Playbook and Anatomy" (2024), which defines neoclouds as "a new breed of cloud compute provider focused on offering GPU compute rental." No CRM hosting. No object storage empires. GPUs, networking, and an invoice.

SemiAnalysis segments the market into four tiers: hyperscalers, Neocloud Giants (CoreWeave, Nebius, Lambda, Crusoe), emerging neoclouds, and brokers/aggregators at the spot-market end. By 2025, Gartner, Synergy, Dell'Oro, and McKinsey all tracked GPU-as-a-Service as its own category. The neocloud stopped being a curiosity and became an asset class.

The players, and the numbers behind them

CoreWeave is the reference case. It IPO'd in March 2025 at $40 a share and a ~$23B valuation, with Nvidia anchoring the book through a $250M order. By full-year 2025 it had reached roughly $5B in revenue, making it the fastest cloud provider in history to reach that mark, with a revenue backlog that grew from $66.8B at end-2025 to $99.4B by March 2026 (CoreWeave SEC filings, 2026). One company. A backlog approaching $100 billion.

The rest of the giant tier moves at similar speed. Nebius signed a $17.4–19.4B contract with Microsoft and a ~$3B Meta deal, targeting $7–9B ARR by end-2026 from roughly $1B a year earlier (Nebius SEC filings, 2025–26). Lambda raised over $1.5B in late 2025 and is eyeing an IPO. Crusoe, builder of OpenAI's Stargate campus in Abilene, was valued around $10B. In Europe, UK-based Nscale hit a $14.6B valuation in March 2026 and is deploying 66,000 Rubin GPUs in Portugal for Microsoft.

Notice the pattern in those customer names. Microsoft, OpenAI, Meta, Anthropic. The anchor tenants of the neocloud business are the hyperscalers and frontier labs themselves. AI compute demand has outgrown hyperscaler balance sheets, so the overflow gets contracted out. An entire industry has been built in the overflow.

The unit economics: utilization or death

Here's the engine room. A neocloud buys (mostly borrows for) GPU clusters, then rents them out by the hour or by the year. Three numbers decide everything.

First: the rental price, which falls every year. H100 rentals peaked above $8/hour in 2023. By early 2026 the budget-tier norm was $2.85–3.50/hour, a 64–75% decline (Silicon Data, 2026). Newer silicon holds a scarcity premium for a while: B200s average ~$6.50/hour and GB200 racks ~$17.85/hour in 2026. The playbook is to ride each generation's premium before it erodes, which is why every neocloud lines up early for Blackwell and now Rubin allocations while Hopper-class GPUs cascade down into cheaper inference work.

Second: utilization. A debt-financed cluster breaks even around 70% utilization. On a 1,024-GPU H100 cluster, the difference between 55% and 85% utilization is minus $330k versus plus $340k per month (American Compute, 2026). Bare-metal gross margins run 55–65% before depreciation. There is, as the same analysis puts it, almost no margin of safety.
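To see where the 70% figure comes from, here is a minimal sketch that draws a straight line through the two quoted data points. The linear profit model is an assumption for illustration (real clusters have step costs and tiered pricing), and the function names are hypothetical.

```python
# Two data points quoted above for a 1,024-GPU H100 cluster (American Compute, 2026).
LOW = (0.55, -330_000.0)   # (utilization, monthly profit in USD)
HIGH = (0.85, 340_000.0)

# Assumption: profit is linear in utilization between and around these points.
SLOPE = (HIGH[1] - LOW[1]) / (HIGH[0] - LOW[0])  # USD per unit of utilization


def monthly_profit(utilization: float) -> float:
    """Interpolated monthly profit at a given utilization (0.0 to 1.0)."""
    return LOW[1] + SLOPE * (utilization - LOW[0])


def breakeven_utilization() -> float:
    """Utilization at which the interpolated monthly profit is zero."""
    return LOW[0] - LOW[1] / SLOPE


if __name__ == "__main__":
    print(f"Breakeven utilization: {breakeven_utilization():.1%}")
    print(f"Profit per utilization point: ${SLOPE / 100:,.0f}/month")
```

Under that assumption, breakeven lands just under 70%, and each percentage point of utilization is worth roughly $22k a month on this cluster.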

Third: depreciation, the fight nobody can settle. Most operators book GPUs over 5–6 year useful lives. In November 2025, Michael Burry argued the economic life is 2–3 years, implying ~$176B of understated depreciation across the big AI spenders for 2026–2028. The counter-evidence is real too: CoreWeave reports A100s from 2020 still re-contracting at ~95% of original rates as they cascade into inference. Both things can be true. The accounting life is a bet on where the workload mix goes.
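The size of the disagreement is easier to feel with numbers. The sketch below compares straight-line depreciation under a book life and a shorter economic life. The $100M cluster cost and the zero salvage value are illustrative assumptions, not figures from any operator's filings.

```python
def annual_straight_line(capex: float, useful_life_years: float) -> float:
    """Annual straight-line depreciation, assuming zero salvage value."""
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    return capex / useful_life_years


def annual_understatement(capex: float, book_life_years: float,
                          economic_life_years: float) -> float:
    """Extra annual expense if the economic life is shorter than the book life."""
    return (annual_straight_line(capex, economic_life_years)
            - annual_straight_line(capex, book_life_years))


if __name__ == "__main__":
    capex = 100_000_000.0  # hypothetical cluster cost in USD
    for book, economic in [(6, 3), (5, 2)]:
        gap = annual_understatement(capex, book, economic)
        print(f"Book {book}y vs economic {economic}y: ${gap / 1e6:.1f}M/year understated")
```

On these assumptions, moving from a 6-year to a 3-year life doubles the annual charge, which is why the useful-life choice moves reported profit so much.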

The contract machinery: take-or-pay, then leverage

The core instrument is the multi-year take-or-pay contract. A customer commits to fixed GPU capacity at a fixed rate for 2–5 years and pays whether or not they use it. The neocloud then borrows against that contracted cash flow and amortizes the debt over the contract term.
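As a rough illustration of why lenders like this structure, the sketch below contrasts take-or-pay revenue, which ignores usage, with on-demand revenue, which scales with utilization. The cluster size, the $3.00/hour rate, and the 3-year term are hypothetical inputs, and the class and function names are invented for this example.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass(frozen=True)
class TakeOrPayContract:
    gpus: int
    rate_per_gpu_hour: float  # fixed for the whole term
    term_years: int

    def annual_revenue(self) -> float:
        """Billed for every committed GPU-hour; customer usage is not an input."""
        return self.gpus * self.rate_per_gpu_hour * HOURS_PER_YEAR

    def total_contract_value(self) -> float:
        """The contracted cash flow a lender can underwrite."""
        return self.annual_revenue() * self.term_years


def on_demand_annual_revenue(gpus: int, rate_per_gpu_hour: float,
                             utilization: float) -> float:
    """On-demand revenue only accrues for hours actually rented."""
    return gpus * rate_per_gpu_hour * HOURS_PER_YEAR * utilization


if __name__ == "__main__":
    contract = TakeOrPayContract(gpus=1024, rate_per_gpu_hour=3.00, term_years=3)
    print(f"Take-or-pay, per year: ${contract.annual_revenue():,.0f}")
    print(f"Take-or-pay, full term: ${contract.total_contract_value():,.0f}")
    print(f"On-demand at 55% utilization: ${on_demand_annual_revenue(1024, 3.00, 0.55):,.0f}")
```

The design point is that utilization never appears in the take-or-pay methods: the utilization risk sits with the customer, which is what makes the cash flow bankable.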

Watch how fast lenders learned to love this. CoreWeave's first GPU-backed facility in 2023 was $2.3B at roughly 15% all-in. By March 2026, its $8.5B "DDTL 4.0" became the first investment-grade rated GPU-backed financing: Moody's A3, priced at SOFR+2.25%. From 15% to roughly 6% in three years. Lenders now treat contracted GPU cash flows like infrastructure assets, the way they treat toll roads.
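To put that repricing in dollar terms, here is a back-of-the-envelope sketch. It assumes simple annual interest on a fully drawn, non-amortizing balance, which real delayed-draw facilities are not, and it applies both rates to the same $8.5B principal purely for comparison.

```python
def annual_interest(principal: float, all_in_rate: float) -> float:
    """Simple annual interest; ignores amortization, fees, and draw schedules."""
    return principal * all_in_rate


if __name__ == "__main__":
    principal = 8_500_000_000.0  # size of the 2026 facility quoted above
    cost_2023 = annual_interest(principal, 0.15)  # roughly the 2023 all-in rate
    cost_2026 = annual_interest(principal, 0.06)  # roughly the 2026 all-in rate
    print(f"At 15%: ${cost_2023 / 1e6:,.0f}M/year")
    print(f"At  6%: ${cost_2026 / 1e6:,.0f}M/year")
    print(f"Saved:  ${(cost_2023 - cost_2026) / 1e6:,.0f}M/year")
```

On those simplified terms, nine points of spread on $8.5B is about $765M a year, which is the scale of what investment-grade status is worth.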

The risks are equally legible, and CoreWeave's own filings name them. Customer concentration (Microsoft alone was 67% of its 2024 revenue). Refresh risk, since each Nvidia generation compresses prior-generation pricing. Leverage against contracts that may not renew at the same rates. And one more, which has quietly become the biggest.

Power is the business model now

A neocloud's P&L is a race between contracted revenue and GPU depreciation. The GPUs start depreciating the day they ship. The revenue starts the day they're energized. Everything between those two dates is pure loss.

This is why the industry conversation flipped in 2026. One earnings roundup titled it "Neoclouds Shift From GPU Race to Power Wars." Backlogs like CoreWeave's $99B are signed against capacity that does not exist yet. The constraint isn't getting GPUs anymore. It's getting megawatts, and getting them energized fast.

You can see the response in how neoclouds source capacity. Three routes, in rising order of control:

  1. Colocation and powered shells. The fast default. CoreWeave contracted ~590 MW across five of Core Scientific's converted bitcoin-mining sites, worth over $10B across 12 years (Core Scientific SEC filings, 2025). Old mines had one thing nobody else did: grid connections.
  2. Self-build greenfield. Crusoe took the first 200 MW phase of Stargate Abilene live roughly 12–15 months after groundbreaking, scaling toward 1.2 GW. Exceptionally fast for hyperscale, and possible because Crusoe is an energy company first. Many operators are going further and generating their own power on-site rather than queuing for the grid.
  3. Modular and prefab capacity. The newest route, and the fastest. Modular deployments quote roughly 6 months versus 2–3 years conventional (IEEE Spectrum, 2026), and Crusoe now manufactures its own prefabricated AI factory modules. Factory-built capacity turns the data center from a construction project into a procurement line item. The full model is in our modular data center guide, with the cost mechanics covered in a separate post.

Run the math on why this matters. Take a 1,024-GPU B200 cluster, conservatively $50M+ of IT capital. At a 6-year book life that's ~$700k of depreciation per month, against zero revenue until energization. A modular route that saves 12 months versus a conventional build doesn't just save construction cost. It moves roughly $8M of dead depreciation back onto the revenue side of the ledger, on one cluster. Neocloud giants operate hundreds.
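The arithmetic above can be reproduced in a few lines. This is a sketch using the figures from the paragraph ($50M of IT capital, a 6-year book life, 12 months saved), with straight-line depreciation and zero salvage value as simplifying assumptions. The function names are hypothetical.

```python
def monthly_depreciation(capex: float, book_life_years: float) -> float:
    """Straight-line monthly depreciation, assuming zero salvage value."""
    return capex / (book_life_years * 12)


def dead_depreciation(capex: float, book_life_years: float,
                      months_dark: float) -> float:
    """Depreciation booked while the GPUs sit un-energized and earn nothing."""
    return monthly_depreciation(capex, book_life_years) * months_dark


if __name__ == "__main__":
    capex = 50_000_000.0  # 1,024-GPU B200 cluster, per the estimate above
    per_month = monthly_depreciation(capex, book_life_years=6)
    saved = dead_depreciation(capex, book_life_years=6, months_dark=12)
    print(f"Depreciation per month: ${per_month:,.0f}")
    print(f"Recovered by energizing 12 months sooner: ${saved:,.0f}")
```

That gives roughly $694k a month and about $8.3M over the 12 months saved, matching the rounded figures in the text.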

The European angle

Europe has its own neocloud wave: Nebius building in Finland and Western Europe, Nscale in Portugal and the UK, plus sovereign AI programs spinning up national GPU capacity. They face everything above plus two extra constraints: grid connection queues of 7–10 years in the FLAP-D hubs, and a compliance regime (EED reporting from 500 kW of IT load, Germany's binding PUE ≤1.2 for new builds from July 2026) that applies the moment capacity energizes. We've mapped that regime in our EU data center regulations guide.

For a European neocloud, the playbook compresses to one sentence: secure power early, deploy capacity that's compliant by design, and never let the building be the reason GPUs sit dark. What that build actually involves, end to end, is the subject of our AI data center build guide, and if your GPUs are already on order, start with our post "So you have GPUs, then what?"

The neocloud business isn't a technology business. It's a logistics business with a balance sheet, where the inventory melts. The winners are the ones who get the melting inventory plugged in first.

Modular Data Centers by ModulEdge

ModulEdge designs modular data centers for enterprises that need on-prem, high-density compute now — not after multi-year construction or grid upgrades.

  • 5–150 kW per rack, engineered for edge compute and AI
  • Integrated power, air/water cooling, fire, monitoring, and security
  • Climate- and site-specific customization, including free cooling
  • Designed to meet Tier III/Tier IV principles
  • Typical custom build cycles: 3–6 months

FAQ

What is a neocloud?

A neocloud is a pure-play cloud provider focused on renting GPU compute for AI training and inference, as opposed to general-purpose hyperscalers. The category was defined analytically by SemiAnalysis in 2024 and includes CoreWeave, Nebius, Lambda, and Crusoe at the giant tier, with dozens of emerging providers and spot marketplaces below them.

How big is the neocloud market?

Neocloud revenue exceeded $25 billion in 2025, with Q4 2025 alone reaching $9 billion, up 223% year over year. Synergy Research forecasts roughly $180 billion by 2030 and approaching $400 billion by 2031, a ~58% compound annual growth rate.
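For readers who want to check the growth rate, here is a minimal sketch of the compound annual growth rate between the 2025 and 2031 endpoints. It treats $400 billion as the 2031 value, which is an upper bound since the forecast only says "approaching".

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # $25B in 2025 to (at most) $400B in 2031 is six years of compounding.
    print(f"Implied CAGR 2025-2031: {cagr(25.0, 400.0, 6):.1%}")
```

With the full $400 billion the result is just under 59%, so a 2031 figure slightly below that is consistent with the ~58% quoted.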

How do neoclouds make money?

Primarily through multi-year take-or-pay contracts: customers commit to fixed GPU capacity at fixed rates for 2–5 years and pay regardless of usage. Neoclouds borrow against these contracted cash flows to finance GPU purchases. On-demand and spot rental is the residual market and carries full price risk.

What does GPU compute cost to rent in 2026?

H100s rent for roughly $2.85–3.50/hour at the budget tier, down 64–75% from the 2023 peak above $8/hour. B200s average about $6.50/hour and GB200 rack-scale capacity around $17.85/hour (Silicon Data, 2026). Each new Nvidia generation carries a scarcity premium that erodes as supply catches up.

What utilization does a neocloud need to be profitable?

A debt-financed GPU cluster breaks even at roughly 70% utilization. On a 1,024-GPU H100 cluster, 55% utilization loses about $330k per month while 85% earns about $340k per month (American Compute, 2026). Gross margins of 55–65% before depreciation leave little room for idle capacity.

What are the biggest risks in the neocloud business model?

Three dominate: customer concentration (CoreWeave's largest customer was 67% of 2024 revenue), GPU obsolescence as each Nvidia generation compresses prior-generation pricing, and time-to-power. GPUs depreciate from the day they ship, so every month waiting for an energized facility is direct economic loss.

Why do neoclouds use modular data centers?

Speed to revenue. Modular deployments run roughly 6 months versus 2–3 years for conventional builds, and a year saved on a $50M cluster recovers roughly $8M of otherwise-dead depreciation. With take-or-pay backlogs signed against capacity that doesn't yet exist, faster energization converts directly into recognized revenue.

Yuri Milyutin

Managing Partner at ModulEdge