Nvidia HPC/AI chip is actually six chips

GIXnews

2 days ago

Nvidia’s idea of a “chip” now includes a single board the size of a PC motherboard. It packs four Blackwell GPUs, two Grace CPUs and a high-speed interconnect that wires them all together, and it draws 5.4 kilowatts of power.

Officially dubbed the GB200 NVL4, it is essentially two GB200 Superchips glued together without the off-board NVLink. Here NVLink handles only on-board communication; data leaving the board travels over standard networking protocols such as Ethernet and InfiniBand.

The GB200 NVL4 Superchip is designed for servers running a mix of high-performance computing and AI workloads, said Dion Harris, director of accelerated computing at Nvidia, in a briefing last week with journalists.

The GB200 NVL4 Superchip features 1.3TB of coherent memory that is shared across all four B200 GPUs using NVLink, which offers bidirectional throughput of up to 1.8 TB/s per GPU.

Compared to the GH200 NVL4, the GB200 NVL4 is 2.2 times faster for a simulation workload using MILC code, 1.8 times faster for training the 37-million-parameter GraphCast weather forecasting AI model, and 1.8 times faster for inference on the 7-billion-parameter Llama 2 model using 16-bit floating-point precision.
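To put those multipliers in terms of wall-clock time, the short sketch below converts each reported speedup into a relative runtime. It assumes that "N times faster" means the same amount of work finishes in 1/N of the time; the article does not say how Nvidia measured the figures, so treat the output as illustrative.

```python
# Speedup factors reported in the article (GB200 NVL4 vs. GH200 NVL4).
speedups = {
    "MILC simulation": 2.2,
    "GraphCast training": 1.8,
    "Llama 2 7B inference (FP16)": 1.8,
}


def relative_runtime(speedup: float) -> float:
    """Fraction of the baseline runtime, assuming equal work on both systems."""
    return 1.0 / speedup


for workload, factor in speedups.items():
    runtime = relative_runtime(factor)
    print(f"{workload}: {runtime:.0%} of baseline runtime, {1 - runtime:.0%} less time")
```

Under that assumption, a 2.2x speedup cuts runtime by a little more than half, and a 1.8x speedup cuts it by a little less than half.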

Nvidia H200 NVL PCIe card

In addition to the GB200 NVL4 Superchip, Nvidia announced that its H200 NVL PCIe card will become available through partners next month. The NVL4 module is an H200 GPU in the SXM form factor designed to fit in Nvidia’s DGX systems as well as HGX systems sold by channel partners like Supermicro.

The H200 NVL connects four cards, double the number of its predecessor, the H100 NVL. It also offers the option of liquid cooling, which the H100 NVL did not. Instead of using PCIe to communicate between cards, the H200 NVL uses an NVLink interconnect bridge, which provides bidirectional throughput of 900 GB/s per GPU, seven times that of PCIe 5.
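The "seven times" figure can be roughly checked from published PCIe 5.0 signaling numbers (32 GT/s per lane with 128b/130b encoding). The sketch below assumes the comparison is against a x16 link, which the article does not state, and it ignores packet and protocol overhead, so it is a sanity check and not a benchmark.

```python
# PCIe 5.0 signaling: 32 GT/s per lane with 128b/130b line encoding.
PCIE5_GT_PER_S = 32.0
ENCODING_EFFICIENCY = 128 / 130
LANES = 16  # assumption: the article does not give the link width

NVLINK_BIDIR_GB_S = 900.0  # per GPU, as quoted in the article


def pcie_bidir_gb_s(gt_per_s: float, lanes: int, efficiency: float) -> float:
    """Bidirectional link bandwidth in GB/s after line encoding."""
    per_direction = gt_per_s * efficiency * lanes / 8  # bits to bytes
    return 2 * per_direction


pcie5_x16 = pcie_bidir_gb_s(PCIE5_GT_PER_S, LANES, ENCODING_EFFICIENCY)
ratio = NVLINK_BIDIR_GB_S / pcie5_x16
print(f"PCIe 5.0 x16: {pcie5_x16:.0f} GB/s bidirectional")
print(f"NVLink vs. PCIe 5.0 x16: {ratio:.1f}x")
```

With these assumptions the ratio comes out at about 7.1x, in line with the roughly sevenfold advantage quoted above.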

The H200 NVL is intended for enterprises to accelerate AI and HPC applications, while also improving energy efficiency through reduced power consumption. It has 1.5x more memory and 1.2x more bandwidth than the H100 NVL. For HPC workloads, performance is up to 1.3x that of the H100 NVL and 2.5x that of the Nvidia Ampere architecture generation.

Dell Technologies, Hewlett Packard Enterprise, Lenovo and Supermicro are all expected to deliver a wide range of configurations supporting H200 NVL. Additionally, H200 NVL will be available in platforms from Aivres, ASRock Rack, ASUS, GIGABYTE, Ingrasys, Inventec, MSI, Pegatron, QCT, Wistron and Wiwynn.

Source: Network World