The future of AI infrastructure is being shaped by a quiet but consequential split: training versus inference.
Training large models demands massive, power-dense campuses, often located in remote, energy-rich regions. Inference workloads, the engines behind real-time applications, pull infrastructure in the opposite direction, toward users, networks, and urban demand centers. This divergence is giving rise to two distinct data center archetypes, each with its own requirements for power, cooling, and siting.
As inference begins to overtake training as the dominant AI workload, hyperscalers are being forced to rethink their infrastructure strategies, balancing scale, speed, and resilience under mounting energy constraints.
This analysis draws on a collaborative research effort led by Chhavi Arora, Marc Sorel, and Pankaj Sachdeva, with contributions from Arjita Bhan, Jess He, Nicholas Shaw, Riya Garg, and Shriya Ravishankar, reflecting perspectives from McKinsey’s Technology, Media, and Telecommunications Practice.
What’s changing isn’t just scale; it’s the nature of AI workloads themselves.
Two Workloads, Two Infrastructure Logics
AI computing today revolves around two fundamentally different tasks: training and inference.
Training is where models are built and refined. It requires enormous power densities, with racks often drawing 100 to 200 kilowatts or more, supported by advanced networking and liquid cooling. Because training jobs are not latency-sensitive, hyperscalers can site these campuses far from population centers, prioritizing access to land, water, and large blocks of grid capacity.
Inference tells a different story. This is the phase where trained models are deployed to serve users: powering search, chatbots, recommendation engines, and real-time decision-making. Inference racks typically draw between 30 and 150 kilowatts and can often run on older or repurposed hardware. Unlike training, inference is tightly linked to revenue and demands high availability and ultra-low latency.
As inference scales, it is becoming the primary driver of AI infrastructure planning. Research suggests that by 2030, inference will account for more than half of all AI compute and roughly 30 to 40 percent of total data center demand. The shift from episodic training bursts to continuous, revenue-critical inference has profound implications for how and where data centers are built.
Reliability, in this context, is nonnegotiable. Many new AI-focused facilities are being designed with full 2N redundancy, ensuring complete backup for every critical system. In inference-heavy environments, downtime translates directly into lost revenue and degraded user experience.
The physical demands of training and inference infrastructure diverge sharply.
Some next-generation training systems are approaching power densities of nearly one megawatt per rack, relying on tightly synchronized clusters of GPUs or specialized accelerators. These facilities require oversized electrical systems, fast-response battery backups, and highly sophisticated cooling. During training cycles, GPU loads can swing by 30 to 60 percent in milliseconds, forcing data centers to absorb these sudden electrical transients without interruption.
Inference environments, by contrast, are more modular and distributed. Tasks can be broken into smaller units and processed independently, making inference better suited to networked, geographically dispersed architectures. While still far more power-intensive than traditional cloud workloads, inference facilities increasingly resemble enhanced cloud data centers rather than classic high-performance computing sites.
This split is pushing hyperscalers toward two parallel design models: one optimized for extreme power density, and another built around speed, responsiveness, and proximity to users.
Cloud Campuses Are Being Rewired
As inference workloads grow, hyperscalers are reshaping their existing cloud campuses. Around 70 percent of new core campuses now host both general cloud compute and AI inference, often separated by data halls or buildings within the same site. Rather than isolating AI systems, operators are embedding inference clusters deep within established campuses to keep them close to storage, networking, and applications.
This shift is redrawing traditional data center layouts. Inference racks are placed closer to access points to minimize latency, while training systems remain more centralized. Smaller, interconnected facilities, linked by high-speed networks, are becoming more common, particularly as inference moves closer to the edge to reduce response times and bandwidth demand.
At the same time, hyperscalers are accelerating the adoption of power-efficient hardware, including custom silicon, neural processing units, and ARM-based architectures, to extract more performance from every watt.
Power Is Now the Primary Constraint
If one factor dominates hyperscaler expansion today, it is access to electricity. Time to power has become the industry’s most acute bottleneck. Just a few years ago, data centers in unconstrained markets could come online within 12 to 18 months. In heavily saturated regions like northern Virginia, timelines now stretch beyond three years.
Tier 1 hubs, such as northern Virginia and Santa Clara, still account for roughly 30 percent of U.S. data center capacity. But grid congestion, lengthy permitting processes, and land prices exceeding $2 million per acre are pushing hyperscalers to look elsewhere.
Tier 2 markets, including Des Moines, San Antonio, and Columbus, are emerging as viable alternatives. In these regions, power can often be delivered one to two years faster, and land costs can be up to 70 percent lower. As a result, hyperscalers are increasingly adopting power-first site selection strategies, working directly with utilities and state authorities to secure energy before committing to construction.
Capital Models Are Shifting Too
The scale and cost of AI infrastructure are reshaping how hyperscalers finance growth. While smaller facilities are often self-funded, multi-gigawatt campuses increasingly rely on joint ventures with infrastructure funds, utilities, and private credit providers. With build costs reaching as high as $25 million per megawatt, speed and capital efficiency are equally critical.
These partnerships unlock funding but introduce new complexity. Aligning incentives, allocating risk, and coordinating with utilities can slow early-stage development. To offset these delays, some developers are turning to behind-the-meter solutions such as fuel cells, microgrids, mobile gas turbines, and even small modular reactors.
Examples are already taking shape. APR Energy is deploying more than 100 megawatts of mobile gas turbines for a U.S. hyperscaler, while Active Infrastructure is planning a large northern Virginia campus built around hydrogen fuel cells, battery storage, and on-site generation.
In this environment, access to entitled land and dependable power has become a decisive competitive advantage.
Five Strategic Shifts in Hyperscaler Playbooks
As AI demand accelerates, hyperscalers are adjusting their strategies in five key ways.
First, they are becoming active participants in the energy ecosystem, investing directly in renewables, storage, and next-generation nuclear to secure long-term supply.
Second, ownership models are becoming more flexible. Lease-to-own structures now account for roughly 25 to 30 percent of new Tier 1 deals, enabling faster capacity acquisition while preserving long-term control.
Third, modular and prefabricated construction is gaining momentum. Standardized designs and preapproved powered shells can cut delivery timelines by up to 50 percent and are increasingly built to support liquid cooling and high-density AI racks from day one.
Fourth, hyperscalers are consolidating scattered sites into large, multi-building campuses. By 2030, these clustered developments are expected to represent around 70 percent of deployments, improving both operational efficiency and resilience.
Finally, retrofitting has emerged as a critical growth lever. Upgrading legacy data centers, through liquid cooling, structural reinforcement, and substation expansion, is often faster and less risky than new construction, while preserving access to key Tier 1 network hubs.
AI as the New Center of Gravity
AI has become the force reshaping every layer of digital infrastructure. The shifts underway, across power markets, construction methods, financing models, and geography, are not incremental. They represent a structural reset. For stakeholders across the value chain, adapting to this reality will be essential to capturing the next wave of opportunity.
