Jacob Yundt, Senior Director of Compute Architecture at CoreWeave, highlights one of the overarching challenges in today’s AI data center: cooling. Keeping components in high-performance deployments operating at appropriate temperatures is paramount. Failure to do so can cause throttling and performance issues, and can even shorten component lifetimes.
CoreWeave’s solution to this challenge is its aggressive adoption of liquid cooling. Air simply does not have the cooling ability needed for these high-powered, highly dense racks. Given CoreWeave’s commitment to native direct-to-chip liquid cooling for its GPU products, it’s only natural that the Solidigm direct-to-chip liquid-cooled SSD is becoming instrumental in its liquid-cooling builds.
Solidigm is pushing the boundaries of data storage with its highly dense SSDs, using CSAL software and liquid cooling to optimize them. CoreWeave is able to leverage those storage innovations in its cloud for AI deployments. In collaboration with Solidigm, CoreWeave provided input into defining standards that allow it to continue to scale, and it leverages those standards to aggressively bring new products to its customers.
CoreWeave is a specialized cloud provider that focuses on accelerated compute. We are the essential cloud for AI. Our systems are fighting the laws of physics.
My name is Jacob Yundt, and I'm the Senior Director of Compute Architecture at CoreWeave.
AI has fundamentally changed the needs of infrastructure. You now need performance at an unprecedented scale. When you have hundreds of thousands of GPUs operating in unison, you need every single component to be working perfectly.
One of the challenges of AI architectures is that you have a fixed power budget for each server, and if you spend a large portion of that budget on fans, there is less power available for GPUs, which means fewer GPUs available to the customer. Air cooling just simply will not work.
Because there's not enough cooling ability for the high-power demands of these GPU racks, you literally cannot move enough air over these heat-generating components to have them operate at the appropriate temperatures, to have them not throttle, and to make sure the lifespan of the components is good. And that results in performance implications for your customers.
One of the reasons that customers use the CoreWeave cloud is that we guarantee an incredibly performant experience, and we can't deliver on that if these components are throttling due to thermal reasons. CoreWeave is aggressively adopting liquid cooling. When we saw the various roadmaps in 2024, we realized that liquid cooling was the only way forward for some of these products. We adjusted our data center plans to only offer native direct-to-chip liquid cooling for our GPU products. And that had a material impact, because it allowed us to be first to market with GB200 and first to market with GB300, since all of our data centers were ready to support native liquid cooling solutions.
One of the ways that Solidigm is leading the space for direct-to-chip liquid cooling of SSDs is by helping define the standards. That allows us to continue to scale, that allows us to be first to market, that allows us to leverage those open standards so that we can aggressively bring new products to our customers.
One of the reasons that we like the Solidigm liquid-cooled SSDs is that we can have these highly performant, highly resilient solutions with liquid cooling so that we can achieve that overall density. Solidigm has been fantastic to collaborate with. We leverage open standards as much as possible, and it's incredibly rewarding to work with Solidigm because I know that they will help push these open standards forward.