
Modern cloud data centers consume not only an immense amount of power for computing and cooling, but also a significant amount of water, as most use evaporative cooling.
By contrast, Nvidia's GB200 NVL72 and GB300 NVL72 machines utilize direct-to-chip liquid cooling systems, which are claimed to be 25 times more energy-efficient and 300 times more water-efficient than today's coolers. However, there is a catch, as NVL72 rack-scale systems consume six to seven times as much power as typical racks.
Typical data center server racks consume around 20kW of power, whereas Nvidia's H100-based racks consume over 40kW. Nvidia's GB200 NVL72 and GB300 NVL72 rack-scale systems, by comparison, consume 120kW to 140kW, far more than the vast majority of racks already installed.
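To put those figures side by side, here is a minimal Python sketch of the arithmetic. The constant and function names are illustrative only, and the H100 value uses 40kW as a floor, since the quoted figure is "over 40kW".

```python
# Rack power figures quoted above, in kilowatts.
TYPICAL_RACK_KW = 20
H100_RACK_KW = 40            # quoted as "over 40kW"; 40 is used as a lower bound
NVL72_RACK_KW = (120, 140)   # quoted range for GB200 NVL72 and GB300 NVL72


def ratio_to_typical(rack_kw: float) -> float:
    """Return a rack's power draw as a multiple of a typical 20kW rack."""
    return rack_kw / TYPICAL_RACK_KW


h100_ratio = ratio_to_typical(H100_RACK_KW)
nvl72_low, nvl72_high = (ratio_to_typical(kw) for kw in NVL72_RACK_KW)

print(f"H100-based rack: at least {h100_ratio:.0f}x a typical rack")
print(f"NVL72 rack: {nvl72_low:.0f}x to {nvl72_high:.0f}x a typical rack")
```

On these numbers, an NVL72 rack draws roughly six to seven times the power of a typical rack and about three times that of an H100-based rack.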
As a result, air-based cooling methods are no longer sufficient to manage the thermal loads produced by these high-density racks, so Nvidia had to adopt a new cooling solution for its Blackwell machines.
Nvidia's GB200 NVL72 and GB300 NVL72 systems use direct-to-chip liquid cooling. This approach involves circulating coolant directly through cold plates attached to GPUs, CPUs, and other heat-generating components, efficiently transferring heat away from these devices without relying on air as the intermediary.
Unlike evaporative cooling or immersion cooling, NVL72's liquid cooling is a closed-loop system, so coolant does not evaporate or require replacement due to loss from phase change, saving water.
In the NVL72 architecture, the heat absorbed by the liquid coolant is then transferred to the data center's cooling infrastructure through rack-level liquid-to-liquid heat exchangers. These coolant distribution units (CDUs), such as the CoolIT CHx2000, are capable of managing up to 2MW of cooling capacity, supporting high-density deployments with low thermal resistance and reliable heat rejection.
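For a rough sense of scale, the sketch below estimates how many NVL72 racks a CDU of that size could serve. It is a back-of-the-envelope calculation, not a sizing guide: it takes the CDU capacity as 2MW (2,000kW), assumes the whole 120kW to 140kW rack load is rejected to liquid, and ignores headroom, redundancy, and any air-cooled share of the heat. All names are illustrative.

```python
# Assumed CDU cooling capacity: 2MW, expressed in kilowatts.
CDU_CAPACITY_KW = 2_000

# Quoted NVL72 rack power range, in kilowatts.
NVL72_RACK_KW = (120, 140)


def max_racks_per_cdu(rack_kw: int, cdu_kw: int = CDU_CAPACITY_KW) -> int:
    """Upper bound on whole racks one CDU could cool, with no safety margin."""
    return cdu_kw // rack_kw


racks_at_low_load = max_racks_per_cdu(NVL72_RACK_KW[0])
racks_at_high_load = max_racks_per_cdu(NVL72_RACK_KW[1])

print(f"At 120kW per rack: up to {racks_at_low_load} racks")
print(f"At 140kW per rack: up to {racks_at_high_load} racks")
```

Under these simplifying assumptions, the ceiling works out to 14 to 16 racks per unit. A real deployment would plan for fewer to leave margin for redundancy and peak loads.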
Additionally, this setup enables the systems to operate with warm water cooling, thereby reducing or eliminating the need for mechanical chillers, which improves both energy efficiency and water conservation.
A few caveats apply to Nvidia's closed-loop direct-to-chip liquid cooling solutions. Although closed-loop liquid cooling is widely used by PC enthusiasts, there are practical, engineering, and economic reasons why such systems are not yet widely adopted at data center scale.
Data centers require modularity and accessibility for maintenance, upgrades, and component replacement, which is why they use hot-swappable components. However, hermetically sealed systems make the quick replacement of failed servers or GPUs difficult, as breaking the seal would compromise the entire cluster.
Also, routing sealed liquid loops across racks and the entire data center introduces logistical complexity in piping, pump redundancy, and failure isolation. Fortunately, current direct-to-chip liquid cooling solutions use quick-disconnect fittings with dripless seals, which offer serviceability without full hermetic sealing (at the end of the day, detecting and isolating leaks quickly is cheaper than creating a fully hermetic data center-scale solution). However, using data center-scale liquid cooling still requires redesign of the whole data center, which is expensive.
Nonetheless, since Nvidia's Blackwell processors offer unbeatable performance, adopters of B200 GPUs are willing to invest in such redesigns. Additionally, it is worth noting that Nvidia has co-developed reference designs with Schneider Electric for 1,152-GPU DGX SuperPOD GB200 clusters, utilizing Motivair liquid-to-liquid CDUs and fluid coolers with adiabatic assist. These designs are meant to enable quick deployment of such systems with maximized efficiency.
Nvidia mandates liquid cooling for its Blackwell B200 GPUs and systems, but the company has invested in reference designs for sealed liquid cooling solutions that avoid evaporative cooling and thereby preserve water, which seems to be a reasonable tradeoff.