How In-Row CDUs and Centrifugal Pumps Are Saving the Megawatt Rack

June 29, 2026
Karan Shah

The artificial intelligence infrastructure industry is approaching one of its most significant mechanical engineering transitions since hyperscale cloud computing emerged nearly two decades ago. Until recently, improvements in server cooling largely focused on refining airflow management, optimizing computer room air handlers (CRAHs), and increasing fan efficiency inside increasingly dense racks. That design philosophy is rapidly becoming obsolete as AI workloads fundamentally alter the thermal profile of modern compute infrastructure. The latest AI platforms now push rack power densities beyond 100 kilowatts (kW), and next-generation AI deployments are likely to reach 250 kW before advancing toward 500 kW to 600 kW in highly integrated AI systems. At these power levels, conventional air cooling cannot remove heat efficiently enough to maintain processor performance and hardware reliability.

As a result, liquid cooling is transitioning from an optional high-performance feature into a foundational requirement for AI infrastructure. This shift is driving unprecedented demand for Coolant Distribution Units (CDUs), direct-to-chip liquid cooling systems, heat exchangers, and precision centrifugal inline pumps that form the mechanical backbone of next-generation data centers. Rather than viewing cooling as a supporting facility service, operators increasingly regard thermal management as a core determinant of compute capacity, energy efficiency, and infrastructure scalability.

The End of Air Cooling

For nearly thirty years, air served as the primary medium for removing heat from enterprise servers. Cold aisle containment, raised floors, perforated tiles, high-volume CRAHs, and sophisticated airflow management became standard components of data center architecture because processor heat loads remained within ranges that airflow systems could manage economically. Even during the early hyperscale era, facilities operating racks between 10 kW and 20 kW successfully relied on increasingly efficient air distribution strategies combined with intelligent building management systems. However, AI has fundamentally changed the thermal equation. A single rack built for large language model workloads can now hold dozens of graphics processing units (GPUs), and each GPU consumes several times more power than a traditional CPU.

The resulting thermal output has increased faster than airflow engineering can realistically compensate. Simply increasing fan speeds or delivering larger volumes of conditioned air no longer provides a practical solution because the physical properties of air limit its capacity to transport heat. Consequently, the physics of heat transfer now constrains thermal engineering more than facility design does.

The transition away from air cooling is a response not merely to higher temperatures but also to higher heat flux. AI accelerators concentrate enormous amounts of thermal energy into relatively small silicon surfaces, creating localized hotspots that conventional airflow cannot dissipate efficiently. Manufacturers now design high-performance processors with thermal design power ratings ranging from hundreds to thousands of watts per device, and system architects integrate these processors into tightly packed server architectures to maximize compute density. Air cooling struggles to maintain consistent junction temperatures under these conditions because, for a given thermal resistance between the die and the cooling medium, the temperature rise grows in proportion to the power dissipated.
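The link between device power and junction temperature can be sketched with a lumped thermal-resistance model. This is a minimal illustration, not a design tool: the helper name `junction_temp_c`, the 35 C inlet temperature, and both resistance values are assumptions chosen for the example, not measurements of any real heat sink or cold plate.

```python
def junction_temp_c(inlet_temp_c: float, power_w: float, r_th_k_per_w: float) -> float:
    """Steady-state junction temperature: inlet temperature plus power times thermal resistance."""
    return inlet_temp_c + power_w * r_th_k_per_w


# Assumed, illustrative junction-to-coolant thermal resistances in K/W.
R_AIR_HEATSINK = 0.10
R_COLD_PLATE = 0.03

# Compare both cooling paths at the same (assumed) 35 C inlet temperature.
for power_w in (300, 700, 1000):
    t_air = junction_temp_c(35.0, power_w, R_AIR_HEATSINK)
    t_liquid = junction_temp_c(35.0, power_w, R_COLD_PLATE)
    print(f"{power_w:>5} W  air: {t_air:6.1f} C  liquid: {t_liquid:6.1f} C")
```

With a fixed resistance, every extra watt adds the same temperature increment, so a path that is adequate at a few hundred watts can exceed safe junction limits at a kilowatt unless the resistance itself is lowered.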

The consequence is thermal throttling, where processors intentionally reduce operating frequencies to protect hardware from overheating. Thermal throttling directly affects AI training throughput, inference latency, and infrastructure utilization, making cooling performance an operational concern rather than simply a facilities issue. Infrastructure operators therefore increasingly evaluate cooling systems based on their ability to sustain continuous processor performance instead of merely maintaining acceptable room temperatures. This change represents one of the most important shifts in modern data center engineering.

The economic implications are equally significant. AI infrastructure investments frequently exceed hundreds of millions of dollars for individual facilities, with GPU hardware representing the largest capital expenditure inside many deployments. Underutilizing those processors because of inadequate cooling directly reduces return on investment. Consequently, mechanical infrastructure has moved from a supporting operational expense to a strategic investment category. Data center designers increasingly recognize that every kilowatt of compute deployed must be matched by sufficient thermal removal capacity throughout the facility. This relationship has elevated pumps, heat exchangers, coolant piping, sensors, valves, and CDU control systems into mission-critical infrastructure components. Whereas previous generations of data centers primarily differentiated themselves through network connectivity or electrical redundancy, AI facilities are increasingly distinguished by the sophistication of their liquid cooling architectures. Mechanical engineering is therefore becoming as important to AI infrastructure competitiveness as processor selection or networking performance.

The Rise of the Megawatt Rack

Rack density has become one of the defining metrics of modern AI infrastructure. Traditional enterprise environments typically operated between 5 kW and 15 kW per rack, while hyperscale cloud deployments gradually expanded toward 30 kW and 40 kW as virtualization increased hardware utilization. The AI era has accelerated this trend dramatically. Systems designed around NVIDIA Blackwell, AMD Instinct, and other advanced AI accelerators require significantly greater electrical power because they integrate larger numbers of GPUs, high-bandwidth memory, ultra-fast networking, and increasingly complex power delivery systems within a single rack. Several infrastructure vendors are now designing facilities capable of supporting racks exceeding 100 kW as standard deployments, while engineering roadmaps anticipate much higher densities during the coming product generations.

These developments are forcing architects to reconsider virtually every aspect of facility engineering, including electrical distribution, structural loading, cooling infrastructure, and maintenance procedures. The concept of the “megawatt rack” is therefore emerging not as marketing terminology but as a realistic planning assumption for future AI factories. One important distinction is that cooling complexity no longer scales linearly with rack density. Doubling compute density often requires disproportionately greater investment in fluid management, redundancy, monitoring, and thermal control systems. Higher heat loads increase pressure requirements throughout cooling loops while simultaneously demanding tighter control over coolant temperature, flow stability, and system reliability. Consequently, the mechanical infrastructure supporting AI racks has become significantly more sophisticated than previous generations of data center cooling systems.

Modern facilities increasingly deploy dedicated liquid distribution networks designed specifically for AI clusters rather than integrating them into conventional building cooling infrastructure. This separation improves operational resilience because internal coolant conditions remain stable even when facility water temperatures fluctuate. As rack densities continue increasing, this architectural approach is becoming the preferred design philosophy among hyperscale operators and AI infrastructure vendors. The evolution from room-level cooling toward rack-level fluid engineering represents one of the defining characteristics of next-generation AI data centers.

Why Liquid Cooling Became Inevitable

The transition from air cooling to liquid cooling is not simply the result of higher processor power consumption. Rather, it reflects a fundamental change in the thermal characteristics of AI hardware itself. Modern AI accelerators pack billions of transistors onto increasingly compact silicon packages while simultaneously operating at substantially higher power levels than conventional enterprise processors. Every additional watt consumed by a processor ultimately becomes heat that must be removed continuously to maintain reliable operation. As chip manufacturers increase transistor density, memory bandwidth, and interconnect speeds, heat generation rises faster than traditional airflow systems can dissipate it. This challenge becomes even more pronounced inside AI servers, where multiple GPUs, CPUs, high-bandwidth memory (HBM), and networking switches operate within confined physical spaces.

Under these conditions, maintaining stable silicon temperatures requires a cooling medium with far greater thermal conductivity and heat capacity than air. Water-based coolant mixtures can carry on the order of thousands of times more heat per unit volume than air for the same temperature rise, making direct liquid cooling the most practical solution for next-generation AI infrastructure. The shift is also driven by energy efficiency objectives. Traditional air cooling requires enormous volumes of conditioned air to circulate continuously throughout the data hall. Computer Room Air Handlers (CRAHs), Computer Room Air Conditioning (CRAC) units, server fans, and containment systems collectively consume a meaningful share of facility power. As rack densities increase beyond 100 kW, the energy required to move sufficient airflow rises disproportionately, reducing overall data center efficiency. Liquid cooling changes this equation by transporting thermal energy directly away from processors before it disperses into the surrounding environment.
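The per-volume comparison between air and water can be made concrete with the sensible-heat relation Q = rho * cp * V * dT. The sketch below uses approximate textbook properties near room temperature and assumes the same 10 K temperature rise for both fluids, which is a simplification; the function name and the dictionary layout are illustrative only.

```python
# Approximate fluid properties near room temperature (textbook values).
AIR = {"density_kg_m3": 1.2, "cp_j_kg_k": 1005.0}
WATER = {"density_kg_m3": 997.0, "cp_j_kg_k": 4180.0}


def volumetric_flow_m3_s(heat_w: float, delta_t_k: float, fluid: dict) -> float:
    """Flow needed to carry heat_w with a delta_t_k temperature rise: V = Q / (rho * cp * dT)."""
    return heat_w / (fluid["density_kg_m3"] * fluid["cp_j_kg_k"] * delta_t_k)


rack_heat_w = 100_000.0  # a 100 kW rack
delta_t_k = 10.0         # assumed temperature rise across the rack

air_flow = volumetric_flow_m3_s(rack_heat_w, delta_t_k, AIR)
water_flow = volumetric_flow_m3_s(rack_heat_w, delta_t_k, WATER)

print(f"air:   {air_flow:.2f} m^3/s")
print(f"water: {water_flow * 60_000:.0f} L/min")  # 1 m^3/s = 60,000 L/min
print(f"ratio: {air_flow / water_flow:.0f}x")
```

Under these assumptions the rack needs roughly 8 cubic meters of air per second, but only about 144 liters of water per minute, a ratio of about 3,500 to 1.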

Because liquids possess far greater heat capacity than air, significantly smaller flow volumes can remove equivalent thermal loads. This allows facilities to reduce dependence on large air-handling infrastructure while improving Power Usage Effectiveness (PUE). More importantly, lower processor temperatures enable AI accelerators to sustain boost frequencies for longer periods without thermal throttling, increasing compute utilization and improving return on investment. Consequently, liquid cooling delivers both operational efficiency and higher compute productivity, making it an increasingly attractive economic proposition for hyperscale AI deployments. Another factor accelerating adoption is the evolution of processor packaging. Previous generations of servers primarily cooled CPUs using relatively simple heat sinks and airflow channels. Today’s AI platforms integrate multiple GPUs, high-bandwidth memory stacks, PCIe switches, advanced networking interfaces, and specialized AI accelerators within densely packed chassis.

These components produce uneven thermal profiles, creating localized hotspots that conventional airflow struggles to address effectively. Direct-to-chip liquid cooling eliminates much of this challenge by placing cold plates directly onto heat-producing components. Coolant circulates through precisely engineered microchannels inside these cold plates, absorbing thermal energy almost immediately after it leaves the silicon surface. This approach minimizes thermal resistance while providing highly consistent cooling across multiple devices. It also enables designers to build increasingly dense compute systems without being constrained by airflow limitations. As AI models continue expanding in complexity and compute requirements, direct liquid cooling is rapidly becoming the preferred thermal architecture across high-performance computing and enterprise AI deployments.

The economics reinforce the engineering case. AI clusters powered by thousands of advanced GPUs often represent infrastructure investments measured in hundreds of millions of dollars. Every percentage point of reduced utilization translates into substantial financial consequences. Thermal throttling, unplanned downtime, or cooling inefficiencies therefore affect not only operational performance but also capital productivity. Data center operators increasingly recognize that cooling infrastructure directly influences revenue generation because it determines how consistently expensive AI hardware can operate at peak performance. Consequently, spending more on sophisticated liquid cooling systems often produces stronger long-term financial returns than relying on lower-cost air-based alternatives that limit compute output. This perspective marks a significant departure from previous generations of facility engineering, where cooling was frequently viewed as a supporting operational expense rather than a strategic investment. Today, thermal management is becoming one of the primary enablers of AI economics.

The Engineering Anatomy of a Coolant Distribution Unit (CDU)

Although GPUs and liquid-cooled cold plates receive considerable attention, the Coolant Distribution Unit has quietly become one of the most important mechanical systems inside modern AI data centers. A CDU functions as the central interface between the facility’s primary water infrastructure and the secondary liquid cooling loop serving IT equipment. Rather than allowing building water to circulate directly through sensitive servers, the CDU isolates the two environments using heat exchangers, pumps, valves, sensors, and sophisticated control systems. This separation protects expensive computing hardware from contamination, pressure fluctuations, corrosion, and variations in facility water quality. It also enables operators to optimize coolant chemistry specifically for electronic components without affecting the building’s larger mechanical systems. As AI deployments become increasingly dense, the CDU evolves from a supporting mechanical device into the operational heart of liquid-cooled infrastructure. Without reliable CDU operation, direct-to-chip cooling cannot maintain the stable temperatures required for sustained AI workloads.

Modern CDUs perform several critical engineering functions simultaneously. Heat exchangers transfer thermal energy from the secondary coolant loop into the facility water loop without allowing the fluids to mix. High-efficiency centrifugal inline pumps circulate coolant continuously through server cold plates, maintaining precise flow rates required for effective heat removal. Variable-frequency drives automatically adjust pump speed according to workload intensity, reducing unnecessary energy consumption during periods of lower compute demand. Sensors continuously monitor pressure, temperature, flow rate, and coolant quality, feeding data into intelligent control systems capable of responding in real time to changing thermal conditions. Many enterprise-grade CDUs also incorporate redundant pumps, automatic bypass valves, leak detection systems, and predictive maintenance capabilities designed to maximize operational resilience. These integrated functions transform the CDU into an intelligent thermal management platform rather than a simple pumping station.
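The energy benefit of variable-frequency drives follows from the pump affinity laws, under which flow scales with shaft speed, head with speed squared, and shaft power with speed cubed. The following is a minimal sketch under those idealized laws; the rated values are hypothetical, and real loops with static head or changing efficiency will deviate from it.

```python
def pump_at_speed(rated_flow_lpm: float, rated_head_kpa: float,
                  rated_power_kw: float, speed_fraction: float) -> dict:
    """Scale a centrifugal pump's rated operating point using the idealized affinity laws."""
    if not 0.0 < speed_fraction <= 1.0:
        raise ValueError("speed_fraction must be in (0, 1]")
    return {
        "flow_lpm": rated_flow_lpm * speed_fraction,        # flow  ~ speed
        "head_kpa": rated_head_kpa * speed_fraction ** 2,   # head  ~ speed^2
        "power_kw": rated_power_kw * speed_fraction ** 3,   # power ~ speed^3
    }


# Hypothetical pump: 600 L/min, 250 kPa, 7.5 kW at full speed.
for fraction in (1.0, 0.8, 0.6):
    point = pump_at_speed(600.0, 250.0, 7.5, fraction)
    print(f"{fraction:.0%} speed -> {point['flow_lpm']:.0f} L/min, "
          f"{point['head_kpa']:.0f} kPa, {point['power_kw']:.2f} kW")
```

Because power falls with the cube of speed, running this hypothetical pump at 80 percent speed still delivers 80 percent of the flow while drawing roughly half the power.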

The increasing sophistication of CDU design reflects the industry’s recognition that cooling reliability directly affects AI infrastructure availability. One of the defining characteristics of modern CDU architecture is hydraulic stability. AI processors require consistent coolant flow regardless of fluctuations elsewhere in the facility. Variations in pressure or flow can reduce cooling efficiency, increase component temperatures, and ultimately affect processor performance. CDUs prevent these issues by creating hydraulically isolated secondary loops dedicated exclusively to IT equipment. Internal pumps maintain carefully controlled flow conditions while plate heat exchangers absorb thermal energy before transferring it into the facility’s primary cooling network. Because the two systems remain physically separated, changes occurring within building chillers or cooling towers have minimal impact on server cooling performance. This decoupled architecture greatly improves operational resilience while simplifying maintenance procedures.

Primary vs. Secondary Cooling Loops: Why Separation Matters

One of the most misunderstood aspects of liquid-cooled AI infrastructure is the distinction between primary and secondary cooling loops. The primary loop typically carries facility water supplied by chillers, dry coolers, or cooling towers throughout the building. This water is optimized for large-scale thermal transport rather than direct contact with sensitive electronic equipment. Variables such as water chemistry, pressure, particulate contamination, and seasonal temperature fluctuations make facility water unsuitable for direct circulation through server cold plates. Consequently, CDUs establish an isolated secondary loop filled with carefully conditioned coolant formulated specifically for electronic cooling applications. This secondary circuit maintains stable temperatures, controlled pressure, and consistent fluid quality regardless of changes occurring elsewhere in the mechanical plant. The result is significantly improved reliability for high-density AI infrastructure operating under continuous computational load.

This separation also enhances scalability. As organizations deploy additional AI racks, engineers can expand secondary cooling networks without fundamentally redesigning the facility’s primary mechanical infrastructure. New CDUs can be integrated incrementally while maintaining consistent hydraulic performance across existing deployments. This modular approach aligns closely with hyperscale expansion strategies, where compute capacity is added progressively rather than through complete facility redesigns. It also simplifies maintenance because individual cooling loops can often be isolated without affecting neighboring clusters. As AI factories continue growing toward multi-megawatt deployments, modular liquid cooling architectures built around independent CDUs are increasingly becoming the preferred engineering model. Rather than treating cooling as centralized building infrastructure, operators now design thermal management systems that scale alongside compute capacity itself. This evolution illustrates how mechanical engineering is becoming inseparable from AI infrastructure strategy.
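The incremental expansion described above can be illustrated with a simple capacity calculation. This sketch assumes every rack draws the same load, every CDU has the same rated capacity, and redundancy is provided by spare units (N+1 by default); the 120 kW rack and 800 kW CDU figures are hypothetical.

```python
import math


def cdus_required(rack_count: int, rack_kw: float,
                  cdu_capacity_kw: float, redundant_units: int = 1) -> int:
    """Number of CDUs needed to carry the total rack heat load, plus spare units."""
    total_kw = rack_count * rack_kw
    base_units = math.ceil(total_kw / cdu_capacity_kw)
    return base_units + redundant_units


# Hypothetical cluster: 120 kW racks served by 800 kW CDUs with N+1 redundancy.
initial = cdus_required(rack_count=16, rack_kw=120.0, cdu_capacity_kw=800.0)
expanded = cdus_required(rack_count=24, rack_kw=120.0, cdu_capacity_kw=800.0)
print(f"16 racks: {initial} CDUs, 24 racks: {expanded} CDUs")
```

In this example, growing from 16 to 24 racks adds one CDU to the secondary side without any change to the calculation for the racks already deployed.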

Coolant Chemistry: The Invisible Engineering Layer Behind AI Infrastructure

As liquid cooling becomes the dominant thermal management strategy for AI infrastructure, the coolant itself is emerging as a critical engineering consideration rather than a simple heat-transfer medium. While discussions often focus on cold plates, pumps, and Coolant Distribution Units (CDUs), the chemical properties of the circulating fluid directly influence system reliability, heat transfer efficiency, equipment longevity, and maintenance requirements. Modern AI data centers cannot simply circulate untreated water through direct-to-chip cooling systems because water chemistry changes continuously through corrosion, dissolved oxygen, biological contamination, mineral scaling, and particulate accumulation. Even microscopic contaminants can reduce thermal conductivity, clog narrow microchannels inside cold plates, and accelerate corrosion across expensive server components. Consequently, operators formulate secondary cooling loops using highly purified water combined with corrosion inhibitors, biocides, buffering agents, and conductivity stabilizers specifically engineered for electronic cooling applications.

These formulations maintain consistent thermal performance while protecting copper, aluminum, stainless steel, elastomers, and other materials found throughout liquid cooling systems. As rack densities continue increasing beyond 250 kilowatts, coolant chemistry is becoming as strategically important as hydraulic engineering because fluid degradation directly affects long-term infrastructure reliability.

The importance of coolant quality increases significantly as modern cold plates become more sophisticated. Many direct-to-chip cooling systems rely on extremely narrow microchannel structures machined into copper cold plates to maximize heat transfer between the processor package and circulating coolant. These channels are often only fractions of a millimeter wide, which gives the coolant a very large wetted surface area over which to absorb thermal energy. However, this design also makes the cooling system more sensitive to particulate contamination, scaling, and biological growth than conventional industrial piping networks.

Small deposits that would be insignificant inside commercial HVAC systems can restrict coolant flow inside AI servers, reducing localized heat transfer and increasing processor temperatures. Infrastructure operators therefore implement comprehensive water treatment programs that include continuous filtration, conductivity monitoring, chemical testing, and scheduled coolant replacement. Some facilities also deploy automated fluid monitoring systems capable of detecting changes in pH, dissolved oxygen, and contamination levels before cooling performance begins to decline. Maintaining coolant quality has consequently become a routine operational responsibility alongside electrical maintenance and network monitoring within modern AI data centers.
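An automated fluid monitoring check of the kind described above might be sketched as follows. The `CoolantSample` fields, the `out_of_range` helper, and every limit in `LIMITS` are hypothetical placeholders for illustration; real acceptance ranges come from the coolant supplier and the cold plate manufacturer.

```python
from dataclasses import dataclass


@dataclass
class CoolantSample:
    ph: float
    conductivity_us_cm: float
    dissolved_oxygen_ppm: float


# Placeholder (low, high) limits for illustration only, not vendor guidance.
LIMITS = {
    "ph": (8.0, 10.5),
    "conductivity_us_cm": (0.0, 5000.0),
    "dissolved_oxygen_ppm": (0.0, 2.0),
}


def out_of_range(sample: CoolantSample, limits: dict = LIMITS) -> list:
    """Return a message for each measurement that falls outside its allowed range."""
    alerts = []
    for name, (low, high) in limits.items():
        value = getattr(sample, name)
        if not low <= value <= high:
            alerts.append(f"{name}={value} outside [{low}, {high}]")
    return alerts


print(out_of_range(CoolantSample(ph=9.0, conductivity_us_cm=1500.0, dissolved_oxygen_ppm=0.5)))
print(out_of_range(CoolantSample(ph=6.5, conductivity_us_cm=1500.0, dissolved_oxygen_ppm=3.1)))
```

Keeping the limits in a single table makes it straightforward to swap in supplier-specified ranges without touching the checking logic.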

Direct-to-Chip Versus Immersion Cooling: Two Different Paths Toward the Same Objective

Although direct-to-chip liquid cooling currently dominates enterprise AI deployments, immersion cooling continues to attract growing interest as rack densities increase toward several hundred kilowatts. Both technologies seek to solve the same engineering problem by replacing air with liquid as the primary heat transfer medium, yet they accomplish this objective using fundamentally different architectural approaches. Direct-to-chip cooling removes heat by circulating coolant through cold plates attached directly to processors, memory modules, and occasionally networking components. The remaining server components continue operating within conventional air-cooled chassis environments, simplifying hardware maintenance while minimizing changes to existing server designs. This hybrid architecture has enabled relatively rapid adoption because server manufacturers can adapt existing platforms without completely redesigning their products. Consequently, most current AI infrastructure deployments favor direct-to-chip cooling due to its compatibility with mainstream enterprise hardware and established operational procedures.

Immersion cooling follows a more comprehensive thermal strategy by submerging entire servers within electrically non-conductive dielectric fluids capable of absorbing heat directly from every component. Fans become unnecessary because heat transfers naturally into the surrounding liquid before moving toward external heat exchangers. This approach offers exceptional thermal performance while reducing mechanical complexity inside servers themselves. Immersion systems can also accommodate extremely high rack densities because virtually every heat-generating component benefits from direct liquid contact. However, widespread enterprise adoption remains relatively limited due to operational considerations involving maintenance procedures, equipment compatibility, fluid management, and ecosystem maturity. Servicing immersed hardware requires specialized handling processes that differ significantly from traditional data center operations, and not all commercial server platforms are optimized for immersion environments.

Consequently, industry analysts generally expect direct-to-chip cooling to dominate mainstream AI deployments during the near term, while immersion cooling expands within specialized hyperscale and high-performance computing environments requiring maximum thermal efficiency. Both technologies will likely coexist rather than compete directly because each addresses different operational priorities and deployment scenarios.

Several major infrastructure suppliers have significantly expanded their liquid cooling portfolios during the past two years in response to accelerating AI investment. Companies including Vertiv, Schneider Electric, nVent, Motivair, Boyd, CoolIT Systems, and Delta Electronics now offer increasingly specialized CDU platforms supporting hyperscale, enterprise, and colocation deployments. Meanwhile, semiconductor vendors such as NVIDIA continue collaborating closely with thermal infrastructure manufacturers to validate reference architectures capable of supporting next-generation AI systems.

These partnerships highlight an important industry shift. Cooling infrastructure is no longer developed independently from compute hardware; instead, processors, servers, racks, power systems, networking equipment, and liquid cooling architectures are increasingly engineered as integrated platforms. This systems-level approach improves deployment consistency while reducing integration complexity for customers constructing large AI facilities. As infrastructure becomes more tightly integrated, mechanical vendors are assuming increasingly strategic positions within the broader AI supply chain. Their products directly influence the achievable performance, efficiency, and scalability of future AI factories.

Beyond 600 kW: Engineering the Next Generation of AI Factories

Although today’s highest-density AI racks already exceed historical design assumptions, industry roadmaps indicate that thermal engineering challenges will continue intensifying throughout the remainder of the decade. Processor manufacturers are increasing transistor counts, memory capacity, interconnect bandwidth, and accelerator integration at a pace that consistently raises rack-level power requirements. Simultaneously, enterprises seek larger AI clusters capable of training increasingly sophisticated foundation models while supporting enterprise-scale inference workloads. These trends suggest that future AI factories may eventually operate rack densities approaching or even exceeding one megawatt under certain specialized configurations. Achieving these performance levels will require substantial advances not only in semiconductor technology but also in facility engineering, fluid dynamics, electrical distribution, and thermal control systems. Cooling infrastructure must therefore evolve continuously alongside compute hardware rather than reacting after new processors enter production. This synchronized engineering approach is becoming one of the defining characteristics of modern AI infrastructure development.

Future facilities are also expected to rely increasingly on intelligent automation. Artificial intelligence will likely optimize its own cooling infrastructure through predictive analytics capable of adjusting pump speeds, coolant temperatures, flow balancing, and maintenance schedules in real time. Digital twins representing complete thermal systems are already being adopted to simulate facility performance before physical modifications occur. Machine learning algorithms can analyze sensor data collected from pumps, valves, heat exchangers, and CDU controllers to identify inefficiencies or potential failures before they affect production workloads. These capabilities will become increasingly valuable as AI campuses expand toward hundreds of thousands of accelerators operating simultaneously. Mechanical infrastructure is therefore evolving from static equipment into software-defined systems capable of adapting dynamically to changing operational conditions.
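The idea of catching a developing fault in sensor data before it affects workloads can be illustrated with a trailing-window z-score check. This is a deliberately simple statistical stand-in for the machine learning methods mentioned above, not a production algorithm; the function name, window size, threshold, and vibration readings are all assumptions made for the example.

```python
from collections import deque
from statistics import mean, stdev


def flag_anomalies(readings, window=10, threshold=3.0):
    """Return indices of readings that deviate from the trailing-window mean
    by more than `threshold` sample standard deviations."""
    history = deque(maxlen=window)
    flagged = []
    for index, value in enumerate(readings):
        # Only evaluate once a full window of earlier readings is available.
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append(index)
        history.append(value)
    return flagged


# Made-up pump vibration readings (mm/s) with one obvious spike.
vibration = [2.0, 2.1, 1.9, 2.0, 2.2, 2.1, 2.0, 1.9, 2.1, 2.0, 2.1, 3.5, 2.0]
print(flag_anomalies(vibration))
```

On this sample only the spike at index 11 is flagged. A real deployment would need to handle sensor dropouts, slow drift, and correlated signals across pumps, valves, and heat exchangers, which this sketch ignores.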

Conclusion

The global race to build AI infrastructure is often portrayed as a competition centered on GPUs, semiconductor manufacturing, and electrical power generation. Those elements remain essential, yet they represent only part of the engineering challenge. Every additional watt consumed by increasingly powerful AI processors ultimately becomes heat that must be removed with exceptional precision and reliability. As rack densities advance toward 250 kilowatts, 600 kilowatts, and eventually beyond, thermal management becomes inseparable from compute performance itself. Coolant Distribution Units, centrifugal inline pumps, direct-to-chip cooling systems, and advanced hydraulic architectures are no longer secondary mechanical components operating behind the scenes. They have become strategic infrastructure determining how much AI compute can actually be deployed, sustained, and economically operated. The future competitiveness of AI factories will therefore depend not only on faster processors but also on the efficiency, resilience, and intelligence of the liquid cooling ecosystems supporting them.