Decentralizing AI with a Liquid-Cooled Development Platform by Supermicro and NVIDIA

Photo of hardware system from Supermicro.

AI is the topic of conversation around the world in 2023. It is rapidly being adopted by all industries including media, entertainment, and broadcasting. To be successful in 2023 and beyond, companies and agencies must embrace and deploy AI more rapidly than ever before. The capabilities of new AI programs like video analytics, ChatGPT, recommenders, speech recognition, and customer service are far surpassing anything thought possible just a few years ago.

However, according to research, fewer than half of companies or agencies are successfully deploying AI applications, with cost as the main barrier. The rest are scrambling to determine exactly how they can harness this new and somewhat mysterious software, which promises to provide a competitive advantage in every industry in the world.

In April 2023, Supermicro launched a new system to help expedite the deployment of AI workloads for developers, new adopters, and established users alike. The liquid-cooled AI development platform is called Supermicro SYS-751GE-TNRT-NV1, and there is nothing like it available in the world today.

The hardware and software system comes with the full suite of NVIDIA AI Enterprise frameworks, models, and tools and the Ubuntu 22.04 operating system. The beauty of this new and revolutionary system is that it decentralizes AI development at an entry-level cost point far cheaper than a large supercomputer.

Normally, researchers must book time slots to use the supercomputer and wait in the queue.
They run an application (machine learning training, for example) and receive results.
When they make changes in the software, they must run the training again by booking another time slot and waiting in the queue.

This is all too time-consuming. It takes too long to get the desired results and increases the overall total cost of ownership (TCO).

With the new AI development platform, all these issues are resolved and the TCO goes down substantially. You can run ML tests, get the results quickly, and run the tests again without waiting. With the proximity of the new system to the actual AI developer, latency is lowered, which can be critically important for many AI workloads.
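As a rough illustration of the turnaround argument above, the sketch below compares total wall-clock time for a series of train-and-revise iterations with and without a queue wait before each run. The iteration count, run time, and queue wait are hypothetical figures chosen for illustration, not measurements, and the model assumes each run takes the same time on both systems.

```python
def total_turnaround_hours(iterations, run_hours, queue_wait_hours=0.0):
    """Wall-clock hours for a series of runs, each preceded by a queue wait."""
    return iterations * (queue_wait_hours + run_hours)


# Hypothetical figures, chosen only to illustrate the comparison
ITERATIONS = 10
RUN_HOURS = 2.0
QUEUE_WAIT_HOURS = 6.0

# Shared supercomputer: every run waits in the queue first
shared = total_turnaround_hours(ITERATIONS, RUN_HOURS, QUEUE_WAIT_HOURS)
# Dedicated local system: no queue wait between runs
local = total_turnaround_hours(ITERATIONS, RUN_HOURS)

print(f"Shared supercomputer: {shared:.0f} h")
print(f"Dedicated system:     {local:.0f} h")
```

In this simple model the queue wait, not the training itself, dominates the total whenever the wait is longer than a single run.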

Optimized hardware for AI enterprise software

The technology that makes this Supermicro product unique is its liquid cooling. The internal closed-loop radiator and cooling system is super-quiet, extremely energy-efficient, and less expensive than most AI hardware. It puts out virtually no heat.

In addition to this new revolutionary hardware technology, the AI development platform is perfectly optimized for the included downloadable NVIDIA AI Enterprise software programs. This includes over 50 workflows, frameworks, pretrained models, and infrastructure optimization that can run on VMware vSphere.

Most importantly, this AI development platform is literally plug-and-play. Take it out of the box, turn it on, connect to the network, download the included AI software of your choice, and start running those AI applications!
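One quick sanity check after first boot is to confirm that the GPUs are visible to the operating system. The following is a minimal sketch, assuming the NVIDIA driver and its nvidia-smi utility are already installed on the system (the article does not spell out the setup steps). The helper names parse_gpu_list and list_gpus are hypothetical.

```python
import shutil
import subprocess


def parse_gpu_list(csv_text):
    """Turn 'name, memory' CSV lines from nvidia-smi into (name, memory) tuples."""
    gpus = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, memory = (field.strip() for field in line.split(",", 1))
        gpus.append((name, memory))
    return gpus


def list_gpus():
    """Return the visible GPUs, or an empty list if nvidia-smi cannot be run."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return []
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    print(f"Visible GPUs: {len(gpus)}")
    for name, memory in gpus:
        print(f"  {name} ({memory})")
```

On the platform described here, the expectation would be four entries, one per installed GPU.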

The technical advancement here is the perfect pairing and optimization of hardware systems to specific NVIDIA AI Enterprise software applications, maximizing the software capabilities to capitalize on the intrinsic advantages of AI.

Optimizing the Supermicro hardware to the unique requirements of NVIDIA AI Enterprise software applications removes all questions about how much memory you need, how many GPUs are needed, or what kind of processors must be installed. Frankly, this system just works, right out of the box.

Here are some of the resulting customer benefits:

Cost-effectiveness: With the price point closer to a workstation, you can deploy AI more cost-effectively than ever before, without trying to figure out what technical hardware components are required to run your applications.
Whisper-quiet system: Quieter than many household appliances, it’s perfect for use in a data closet, remote location, under your desk, or even in your home.
Super-powerful system: The platform includes four NVIDIA A100 GPUs and two Intel CPUs that can run any AI application available today.
Lower TCO with significant energy savings: The self-contained liquid cooling system almost completely cools itself without needing external AC or connections to any building chilled-water system.
Increased security: The platform can be operated in a local data center, with or without relying on the cloud, and it’s secure in either location.
Significant time savings: You can have a dedicated, decentralized system that enables you to run ML tests, get results, and re-run without waiting.

Energy-efficient and quiet cooling

The new AI development platform from Supermicro features a novel liquid-cooling solution offering unmatched performance and customer experience.

The liquid cooling solution is self-contained and invisible to the user. This system can be used like any other air-cooled system and offers a problem-free, liquid-cooling experience for any type of user.

The optimized Supermicro cold plates deliver efficient cooling to two 4th Gen Intel Xeon Scalable CPUs (270 W TDP) and up to four NVIDIA A100 GPUs (300 W TDP).

An N+1 redundant pumping system module moves the liquid through the cold plates to cool the CPUs and GPUs. Its redundancy allows continuous operation if a pump fails, keeping system uptime high.

The heat is transferred to the surrounding air with a high-performance radiator coupled with low-power fans.

The innovative liquid cooling system designed by Supermicro cools the system using less than 3% of its total power consumption, compared with 15% for standard air-cooled products.
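To put the cooling figures in rough perspective, the sketch below works through the arithmetic using the CPU and GPU TDP values quoted above. It rests on two simplifying assumptions that are not in the article: the component load is taken to be only the CPU and GPU TDPs (memory, storage, and other parts are ignored), and the percentages are read as a share of total power, where total power is component load plus cooling. The 3% figure is treated as an upper bound.

```python
# Component counts and TDPs quoted in the article
CPU_COUNT, CPU_TDP_W = 2, 270
GPU_COUNT, GPU_TDP_W = 4, 300


def cooling_power_w(component_load_w, cooling_fraction):
    """Cooling power when cooling is `cooling_fraction` of total system power.

    Assumes total = component load + cooling, which gives
    cooling = fraction * load / (1 - fraction).
    """
    return cooling_fraction * component_load_w / (1.0 - cooling_fraction)


# Assumption: only CPUs and GPUs count toward the component load
component_load_w = CPU_COUNT * CPU_TDP_W + GPU_COUNT * GPU_TDP_W

liquid_w = cooling_power_w(component_load_w, 0.03)  # upper bound for liquid cooling
air_w = cooling_power_w(component_load_w, 0.15)     # typical air-cooled figure

print(f"Component load:      {component_load_w} W")
print(f"Liquid cooling (3%): {liquid_w:.0f} W")
print(f"Air cooling (15%):   {air_w:.0f} W")
```

Under these assumptions the component load is 1,740 W, and cooling draws at most about 54 W with liquid cooling versus about 307 W at the air-cooled figure.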

Finally, the system operates at an extremely low noise level (~30 dB) at idle, making it perfect for a quiet office environment.

For more information, see Liquid Cooled AI Development Platform.

Source: NVIDIA