The neocloud infrastructure commoditisation debate has moved from theoretical to operational. GPU-as-a-service providers that barely existed before 2022 are now operating at gigawatt scale, signing multi-billion-dollar contracts with hyperscalers and enterprises, and commanding valuations that would seem implausible for what is, at its core, a hardware rental business. CoreWeave, Nebius, Lambda Labs, and a growing tier of regional operators have built real businesses by doing what hyperscalers could not do fast enough: deploying GPU capacity quickly in a market where demand was outrunning supply.
The problem is that the conditions that made the neocloud model lucrative are temporary, and several of them are already reversing. GPU prices are falling. Hyperscalers are catching up on capacity. Enterprise buyers are getting smarter about what they actually need from AI infrastructure. The neoclouds that thrive through the next phase of this market will be the ones that used the current window to build something that is not just hardware. The ones that did not are going to find the next three years significantly harder than the last three.
What Made the Neocloud Infrastructure Window Possible
The neocloud opportunity was created by a specific market failure: hyperscalers could not deploy AI compute capacity fast enough to meet enterprise demand, and they could not match the flexibility that developers and AI-native companies needed when experimenting with new models and hardware configurations. Neoclouds stepped into that gap, offering access to NVIDIA’s latest GPUs on shorter lead times, at configurations that hyperscalers did not offer, and with the kind of direct technical engagement that a hyperscaler’s support model is structurally unable to provide.
That gap was real and it generated real revenue. But it was always a window, not a permanent market structure. Hyperscalers have been closing the capacity gap through aggressive infrastructure investment, and as that capacity comes online, the simple scarcity premium that neoclouds charged for GPU access will erode. NVIDIA’s own hardware cost curve is working against them too. As each GPU generation delivers more inference performance per dollar, the total compute cost per workload falls, reducing the revenue per rack that infrastructure-only neoclouds can generate.
Neoclouds have differentiated themselves from hyperscalers partly by being faster and more flexible, but speed and flexibility are capabilities that well-resourced competitors can replicate. They are not structural advantages that compound over time. A neocloud that is primarily selling raw GPU hours is selling a commodity, and commodity markets converge toward the lowest sustainable margin. The question is not whether that convergence happens. It is which neoclouds will have built something defensible before it does.
The Commoditisation Pressure Is Already Visible
The evidence that GPU-as-a-service is commoditising is already in the market. Spot GPU pricing on major neocloud platforms has declined substantially from 2023 peaks as supply has caught up with demand in the most accessible hardware tiers. Enterprise customers who locked in long-term contracts at peak pricing are not renewing at the same rates when those contracts expire, because the alternatives have multiplied and the pricing has normalised.
The inference cost curve compounds this pressure. NVIDIA’s Vera Rubin platform is projected to deliver roughly a tenfold reduction in cost per token relative to Blackwell for inference workloads. As customers upgrade to newer hardware, the revenue per unit of AI work they consume falls, even if the volume of work they run increases. A neocloud that prices on compute capacity, rather than on the value of the AI outcomes that compute enables, faces a structural revenue headwind that hardware efficiency improvements will continuously widen.
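The arithmetic behind that headwind can be sketched with illustrative numbers. The tenfold cost-per-token reduction comes from NVIDIA's stated projection; the starting price and the volume-growth figure below are assumptions chosen purely for illustration.

```python
# Illustrative model of the revenue headwind for compute-priced providers.
# Assumptions (not from any real price sheet): commodity pricing means the
# price per token tracks hardware cost per token; cost per token falls 10x
# across a hardware generation; token volume grows 4x as cheaper inference
# unlocks new demand.

def revenue(price_per_million_tokens: float, tokens_millions: float) -> float:
    """Revenue from serving a given token volume at a given price."""
    return price_per_million_tokens * tokens_millions

# Current generation: a hypothetical $2 per million tokens, 100M tokens served.
current = revenue(2.00, 100)

# Next generation: cost per token falls 10x, and in a commodity market price
# follows cost. Volume grows 4x.
next_gen = revenue(2.00 / 10, 100 * 4)

# Even with 4x volume growth, revenue falls by 60%.
decline = 1 - next_gen / current
print(f"current=${current:.0f}, next=${next_gen:.0f}, decline={decline:.0%}")
```

The point of the sketch is that volume growth has to outpace the hardware cost curve — here, more than tenfold — just for compute-priced revenue to stand still.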
Hyperscalers are also learning to serve the segments that neoclouds initially captured. AWS, Azure, and Google Cloud have all expanded their GPU instance offerings, improved their time-to-provisioning, and built more direct customer engagement models for AI workloads. The flexibility advantage that neoclouds held in 2022 and 2023 is narrowing. Enterprise buyers who previously chose neoclouds because hyperscaler AI infrastructure was inflexible and slow to access are finding that the hyperscaler offerings have improved enough to reconsider.
What Moving Up the Stack Actually Means
Moving up the stack is an easy thing to say and a hard thing to do. For a neocloud, it means building capabilities that sit above raw compute: managed inference services, AI model serving infrastructure, fine-tuning pipelines, observability and monitoring tools, cost optimisation services, and the compliance and governance frameworks that regulated enterprise customers require. It means becoming a partner in AI deployment rather than a supplier of GPU hours.
The rise of inference clouds as a distinct tier represents the most immediate opportunity for neoclouds to escape pure infrastructure commoditisation. Inference cloud operators who offer optimised serving infrastructure, model routing, semantic caching, and latency guarantees on top of their GPU capacity are building a differentiated product for which enterprise customers will pay a premium over raw compute access. That premium is defensible in ways that raw GPU pricing is not, because it requires operational expertise and software depth that pure hardware operators do not have.
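To make one of those capabilities concrete, model routing can be thought of as a policy that sends each request to the cheapest serving tier that can handle it. The sketch below is a deliberately naive illustration — the model names, prices, and length-plus-keyword heuristic are all hypothetical; real inference clouds typically use learned classifiers or model cascades for this decision.

```python
from dataclasses import dataclass

@dataclass
class ModelTier:
    name: str
    cost_per_million_tokens: float  # hypothetical pricing

# Two hypothetical serving tiers: a small, cheap model and a large one.
SMALL = ModelTier("small-8b", 0.20)
LARGE = ModelTier("large-70b", 2.00)

def route(prompt: str, max_small_tokens: int = 200) -> ModelTier:
    """Naive router: short prompts without reasoning keywords go to the
    small model; everything else goes to the large model.

    A word count stands in for a token count, and a keyword check stands
    in for a real complexity classifier.
    """
    approx_tokens = len(prompt.split())
    needs_reasoning = any(k in prompt.lower() for k in ("prove", "derive"))
    if approx_tokens <= max_small_tokens and not needs_reasoning:
        return SMALL
    return LARGE

print(route("Summarise this paragraph in one sentence.").name)  # small-8b
print(route("Prove that the algorithm terminates.").name)       # large-70b
```

The margin story lives in the gap between the two tiers: every request the router safely downgrades is served at a tenth of the cost, and that spread — not the GPU hour — is what the operator is actually selling.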
The neoclouds best positioned to make this transition are those that have already built significant enterprise customer relationships and understand what those customers actually need from AI infrastructure. The ones that have been selling primarily to AI-native startups and individual developers are in a weaker position, because the software and compliance sophistication required to serve regulated enterprise workloads is not something that can be assembled quickly under competitive pressure.
Why Sovereign Partnerships Are Part of the Answer
One category of differentiation available to neoclouds, and structurally difficult for hyperscalers, is sovereign infrastructure. Government customers, regulated enterprises, and organisations in markets where data sovereignty requirements make shared hyperscaler infrastructure problematic are increasingly willing to pay premium pricing for infrastructure that meets their specific compliance requirements. A neocloud that builds the certifications, the operational models, and the jurisdictional infrastructure to serve these customers is building a customer base that hyperscalers struggle to compete for effectively.
Sovereign partnerships also provide capital access and revenue stability that purely commercial neocloud businesses struggle to achieve. A neocloud with a committed sovereign customer base, whether through government contracts, sovereign fund co-investment, or certified compliance infrastructure for regulated industries, has a contracted revenue floor that makes the infrastructure investment case significantly stronger than pure market demand forecasting would support.
The neoclouds that will define what the sector looks like in five years are the ones doing the hard work now to build software depth, enterprise relationships, and differentiated positioning that sits above raw compute. The ones treating their current market position as durable because GPU demand remains strong are reading today’s market as if it will persist unchanged. It will not. The window to build something defensible is open. It will not stay open indefinitely.
