AI Explosion Forcing Data Center Design Shift 'On A Viral Scale'
The artificial intelligence arms race is transforming the data center industry in a multitude of ways, including the way operators tackle one of their top challenges: cooling the facilities.
The chips needed for AI and other increasingly prominent high-performance computing tasks create significantly more heat than traditional servers. Instead of cooling data centers with air, operators are increasingly using liquid refrigerant to cool servers directly.
Liquid cooling has long been discussed in industry circles, but it is suddenly seeing widespread adoption in data centers operated by cloud and social media giants. How quickly liquid cooling takes hold across the rest of the industry remains to be seen, but there is broad consensus that the shift toward liquid cooling as the industry standard is well underway.
“We see this as kind of happening on a viral scale,” Bijan Nowroozi, Open Compute Project Foundation's chief technology officer, said at Bisnow’s National DICE Data Center Management, Operations and Cooling Series at the Hilton Anatole Dallas hotel earlier this month. “There's so much activity in liquid cooling and so much global interest. It is off the charts.”
Managing the heat produced by thousands of servers and other computing equipment is one of the primary challenges for data center builders and operators. Traditionally, data centers have used chilled air for cooling — essentially enormous, finely tuned air-conditioning systems that in many cases deliver cold air directly to each individual server.
But now data centers are increasingly utilizing liquid-cooling systems, piping liquid refrigerant into cooling plates attached to servers or into plates and pipes integrated into the computing equipment itself. In other cases, specially designed servers are fully immersed in a cooled, nonconductive refrigerant. Once in place, liquid-cooling systems typically remove heat more efficiently than air cooling, generally using less power and water than traditional systems.
Liquid cooling for data centers has been around for decades, but until now its use has been limited mainly to niche applications like supercomputers. Despite the improved performance liquid cooling offers over traditional systems, data center operators and their tenants have been hesitant to make the switch. And not without reason.
While liquid cooling is cheaper to operate, the systems are more expensive to build and require significant design changes from air-cooled buildings. Often, they require different servers and other data center equipment. There was also widespread uneasiness about the risks of having massive amounts of fluid in a mission-critical environment where a leak could lead to disaster.
Perhaps most significantly, air cooling worked. Changing to liquid cooling meant throwing away a vast body of knowledge about how cooling systems should be run and optimized, retraining experienced teams from scratch and putting new management systems in place. The marginal gains just weren’t worth it.
But now, an AI arms race between tech giants is making liquid cooling a necessity. Following the enormous success of ChatGPT, companies like Microsoft, Google and Meta have poured resources into AI and the digital infrastructure needed to support it.
AI computing requires high-performance processors known as GPUs that use far more power than traditional data center chips. As a result, AI computing creates significantly more heat than traditional data center equipment, exceeding the capabilities of air-cooling systems.
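As a rough illustration of the scale of that shift, the back-of-envelope comparison below uses assumed, illustrative figures (not vendor specifications) for a conventional air-cooled rack versus a rack packed with multi-GPU AI servers, treating essentially all server power draw as heat the cooling system must remove.

```python
# Back-of-envelope rack heat comparison.
# All figures are illustrative assumptions, not vendor specs.
# Nearly all electrical power drawn by servers ends up as heat
# that the facility's cooling system must remove.

TRADITIONAL_RACK_KW = 8      # assumed: typical air-cooled enterprise rack density
AI_SERVER_KW = 10            # assumed: one multi-GPU AI training server
AI_SERVERS_PER_RACK = 4      # assumed: dense AI rack configuration

ai_rack_kw = AI_SERVER_KW * AI_SERVERS_PER_RACK

print(f"Traditional rack heat load: ~{TRADITIONAL_RACK_KW} kW")
print(f"AI rack heat load:          ~{ai_rack_kw} kW")
print(f"Relative heat per rack:     ~{ai_rack_kw / TRADITIONAL_RACK_KW:.0f}x higher")
```

Under these assumptions, a single AI rack throws off several times the heat of a conventional rack, which is the gap that pushes operators past what air-based systems can practically handle.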
“Chip power is really driving a lot of liquid cooling,” John Sasser, chief technology officer at Sabey Data Centers, said at the DICE event. “To get that sort of performance, you just can't do it with air-cooled chillers. Customers are going to demand that performance. That performance is going to require liquid cooling, and so you're going to see increased adoption in the data center, first for applications like AI.”
Cooling the high-performance computing infrastructure required for AI is already having a significant impact on hyperscale data center development. In December, Meta paused all global data center construction to fundamentally redesign its facilities to be optimized for AI and use liquid cooling. Industry insiders say other tech giants are similarly switching to liquid-cooling designs en masse.
“It’s already happening in the hyperscale space,” Nowroozi said. “Everybody wants to do it: Google wants to do it, Microsoft wants to do it. These are very data-heavy, AI-intensive applications where, ultimately, the compute density is so high it requires liquid cooling.”
While liquid cooling is quickly becoming the new standard in the hyperscale space, there was a general consensus among panelists at Bisnow’s DICE summit that it will take longer for widespread adoption to trickle down to operators and tenants of enterprise and colocation data centers.
“I think we'll see an evolution, but it will not be a revolution,” said Adil Attlassy, chief technology officer at Compass Datacenters. “It will be a slow uptake of liquid cooling — probably five years out, in my view.”
According to Attlassy, the vast majority of workloads being processed in data centers by corporate users don’t need these new high-performance chips at present, nor are IT systems set up to utilize them efficiently. Air cooling still works fine for most of what companies outside of the major cloud and social media providers are doing with their digital infrastructure, he said.
Compass’ largest customers require a mix of liquid and air cooling, but almost 90% is typically still air-cooled, according to Attlassy. He doesn’t expect this ratio to change significantly for another three or four years.
But three or four years isn’t actually that much time in an industry where the development process can span even longer, and companies are planning for the capacity they’ll need in a decade. For colocation operators and companies building data centers on spec, the speed at which liquid cooling becomes the industry norm has real implications for how they design the data centers they are developing today.
Companies are trying to avoid a stark choice: meet the current needs of tenants who largely require air cooling and risk obsolescence, or gamble on a liquid-cooling future that may or may not emerge as predicted. But designing for flexibility can mean weighing trade-offs, according to Sabey’s Sasser. He said that while the vast majority of the company’s tenants require air cooling, Sabey switched to a less efficient data center design because it can accommodate liquid cooling as demand emerges.
“We’re driven by what we're seeing in the market. What we're seeing is still mostly air-cooled,” Sasser said. “I do believe that liquid cooling is going to get a lot more traction in the future, and we've made decisions around that, but not to the point of spec building liquid-cooled data centers.”