A sharp rise in artificial intelligence workloads is pushing organizations to reassess how and where they run their most demanding computing. New findings from Deloitte, as noted by Chris Thomas, Ganesh Seetharaman, and Diana Kearns-Manolatos, suggest that the next 12 months could bring some of the biggest infrastructure shifts in years, as enterprises balance performance, cost, and resilience in the age of AI.
Deloitte’s survey of 120 operators across data centers, energy providers, and distributors (conducted between March and April 2025) shows that every major computing environment, from mainframes to public cloud to edge systems, is expected to see at least 20% growth in workloads linked to AI activity. These workloads span pretraining, reinforcement learning, reasoning tasks, and large-scale inferencing, especially as agentic AI systems expand.
Emerging AI Clouds and Edge Set for the Sharpest Growth
According to the survey, emerging AI cloud providers are predicted to face the steepest rise in demand (87%), followed by edge computing platforms (78%). These increases significantly outpace traditional on-premises data centers, which are projected to grow at far lower rates.
While workloads on mainframes and enterprise on-prem systems will also increase, nearly one-third of respondents expect to reduce reliance on these environments within the year. Many are shifting capacity to cloud platforms, optimizing existing hardware for AI, or reactivating old facilities to handle compute-intensive needs.
Deloitte notes that enterprises are now experimenting with several modernization strategies: reconfiguring data centers, collaborating with hyperscale cloud providers, and building GPU-optimized environments that support high-performance computing (HPC).
Leaders Struggle With New Resilience and Security Demands
The rapid expansion of AI workloads is complicating long-standing infrastructure decisions. Many IT leaders now face questions about balancing on-prem, cloud, and HPC solutions as data volumes explode and architectures grow increasingly distributed.
Ensuring data integrity, low-latency performance, strong security, and fast recovery has become harder as the attack surface widens and more AI systems interact across multiple platforms.
Enterprises also remain uncertain about how quickly AI workloads will grow, making long-term planning challenging.
Hybrid Models Rise: Cloud, Edge, and On-Prem Together
Deloitte’s findings highlight several hybrid approaches gaining traction:
- Burst-first, buy-later: Organizations use spot-priced cloud GPUs for experimentation and only invest in dedicated hardware once utilization becomes stable.
- Selective multicloud: Multiple clouds are used strategically, with sensitive data and real-time inferencing kept on-prem to manage costs.
- Hybrid cloud–edge setups: Common in healthcare, manufacturing, and autonomous systems, this model keeps training in the cloud while running applications locally on advanced processors.
- Air-gapped vertical solutions: Highly regulated sectors prefer isolated on-prem systems for privacy, security, and sovereign AI requirements.
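The "burst-first, buy-later" pattern above comes down to a simple break-even check: rent spot-priced cloud GPUs while utilization is low or volatile, and buy dedicated hardware only once steady utilization makes ownership cheaper. A minimal sketch of that check, using hypothetical prices and utilization figures (not Deloitte's):

```python
# Hypothetical break-even check for the "burst-first, buy-later" pattern.
# All rates are illustrative placeholders, not real vendor quotes.

def monthly_gpu_cost(spot_rate_per_hour: float,
                     dedicated_rate_per_hour: float,
                     utilization: float,
                     hours_in_month: int = 730) -> dict:
    """Compare renting vs owning one GPU at a given utilization (0..1)."""
    rent = spot_rate_per_hour * utilization * hours_in_month   # pay only for busy hours
    own = dedicated_rate_per_hour * hours_in_month             # pay whether busy or idle
    return {"rent": rent, "own": own, "buy": own < rent}

# Illustrative numbers only: $2.50/h spot vs $1.00/h amortized dedicated cost.
low = monthly_gpu_cost(2.50, 1.00, utilization=0.15)   # experimentation phase
high = monthly_gpu_cost(2.50, 1.00, utilization=0.80)  # stable production load

print(low["buy"], high["buy"])  # renting wins at 15%; buying wins at 80%
```

At these assumed rates the crossover sits at 40% utilization, which is why the pattern defers hardware purchases until demand stabilizes.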
Cost Pressures Push Workload Rebalancing, But Not for All
Cost remains the strongest motivator for shifting workloads away from cloud platforms.
- 55% of respondents say they will begin moving workloads once hosting and compute expenses cross a set threshold.
- 17% cite latency or security as their primary concerns.
- Yet nearly 27% plan to stay on the cloud even if costs rise.
Deloitte warns that many organizations overlook the full cost of AI, which includes not only GPUs and CPUs, but also inferencing expenses measured in AI tokens, networking, storage, and latency management.
Some enterprises are already building dashboards to track usage and identify high-cost patterns across their estate. Others are optimizing inference strategies to control token consumption.
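A basic version of such a cost dashboard is an aggregation over the components Deloitte lists: compute plus the often-overlooked inferencing tokens, networking, and storage. A minimal sketch, in which all unit rates, metric names, and workloads are hypothetical:

```python
# Hypothetical full-cost rollup for AI workloads across cost components.
from collections import defaultdict

RATES = {                      # illustrative unit prices, not real quotes
    "gpu_hours": 2.00,         # $ per GPU-hour
    "tokens_millions": 0.60,   # $ per million inference tokens
    "egress_gb": 0.09,         # $ per GB of network egress
    "storage_gb_month": 0.02,  # $ per GB-month of storage
}

def rollup(usage_records):
    """Sum cost per workload from (workload, metric, quantity) records."""
    costs = defaultdict(float)
    for workload, metric, qty in usage_records:
        costs[workload] += RATES[metric] * qty
    return dict(costs)

usage = [
    ("chat-agent", "gpu_hours", 500),
    ("chat-agent", "tokens_millions", 1200),
    ("chat-agent", "egress_gb", 800),
    ("batch-train", "gpu_hours", 4000),
    ("batch-train", "storage_gb_month", 50000),
]

# Rank workloads by total cost to surface high-cost patterns.
for name, cost in sorted(rollup(usage).items(), key=lambda kv: -kv[1]):
    print(f"{name}: ${cost:,.2f}")
```

Even this toy rollup makes the point in L22: for the inference-heavy workload, token spend rivals the GPU bill, so tracking compute alone understates the true cost.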
When Will Companies Leave the Cloud?
Nearly one-third of respondents (30%) say they won’t consider moving off the cloud until cloud costs reach 1.5 times the price of alternatives, meaning cloud must cost 50% more than owned infrastructure before they switch. Deloitte notes this could delay necessary investments in AI performance and competitiveness.
Meanwhile, 24% say they will consider transitioning when alternatives are just 25–50% cheaper, indicating a more proactive approach to long-term total cost of ownership.
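The two switching thresholds above amount to a simple ratio test on cloud spend versus the alternative. A sketch with hypothetical annual costs (the 1.5x and 25%-cheaper thresholds come from the survey; everything else is illustrative):

```python
# Hypothetical ratio test for the cloud-exit thresholds in the survey.

def switch_decision(cloud_cost: float, alt_cost: float,
                    threshold_ratio: float) -> bool:
    """Return True once cloud spend has crossed the switching threshold."""
    return cloud_cost / alt_cost >= threshold_ratio

cloud, alternative = 1_400_000, 1_000_000  # annual spend, illustrative only

# Conservative group: waits until cloud reaches 1.5x the alternative.
print(switch_decision(cloud, alternative, 1.5))        # 1.4x -> stays on cloud

# Proactive group: moves once the alternative is ~25% cheaper (ratio ~1.33).
print(switch_decision(cloud, alternative, 1.0 / 0.75))  # 1.4x -> switches
```

The same spending profile triggers a move for the proactive group but not the conservative one, which is the gap Deloitte flags as a competitiveness risk.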
Some large enterprises are already moving ahead by building AI factories—data centers specifically designed for AI workloads. Deloitte’s research shows that 15 global telecoms introduced such facilities in 2024.
Power Constraints and Security Top Operator Concerns
Operators also report growing worries about:
- Power and grid capacity limits (70%)
- Cyber and physical security risks (63%)
Even so, 78% believe technological advancements will help ease these pressures over time.
Several developments could reshape GPU deployment patterns, including:
- Localized data residency laws
- Lower GPU prices and better utilization
- Greater use of smaller, distributed GPU servers
- Rising edge-AI needs for sectors like smart cities and autonomous vehicles
- Improved workload mobility tools
- Staffing shortages pushing automation
- Heightened security requirements
Deloitte’s analysis suggests that long-term success will require enterprises to maintain deep visibility into workloads, enforce strong security protocols, and prioritize high-quality data.
