Nvidia announces Blackwell Ultra, its next-generation GPU

Nvidia on Wednesday introduced its next-generation GPU, Blackwell Ultra, and announced new systems based on the chip.

The Blackwell Ultra GPU is in production and will succeed the current-generation Blackwell GPU, said Ian Buck, Nvidia’s vice president of hyperscale and high-performance computing, in a press briefing call.

The new GPU was announced at Nvidia’s GTC conference, which is being held this week in San Jose, California.

Nvidia representatives didn’t share Blackwell Ultra’s shipment date but said systems with the GPU will be available later this year.

The base architectural design of Blackwell Ultra is similar to Nvidia’s Blackwell, but it provides incremental performance improvements with increased memory capacity and AI tweaks in the silicon, said Anshel Sag, vice president and principal analyst at Moor Insights and Strategy.

“It means potentially fewer GPUs for the same tasks and possibly being able to run larger models on a single GPU at the most fundamental level,” Sag said.

Designed for the age of reasoning

The Blackwell Ultra GPU is designed for the “age of reasoning,” Buck said, citing the DeepSeek model, which delivers better results through additional reasoning than earlier knowledge-based models that simply returned an answer.

“Reasoning models operate differently. They’re asked a question, a complex one, and actually don’t answer the question right away, but instead produce literally thousands or 10,000 thinking tokens before coming up with an answer,” Buck said.

Buck said Blackwell Ultra was the first GPU with 288GB of HBM3e memory. The total memory capacity of the current-generation Blackwell GPU is 192GB of HBM3e memory.

The new GPU is also 1.5 times faster than Blackwell in FP4 inferencing. FP4 is a low-precision, 4-bit floating-point data type used for inferencing; it delivers faster AI responses while requiring less power and memory.

Nvidia’s GPUs support data types ranging from FP4 to FP64. FP64 provides more accurate responses but also consumes more power and time. Nvidia didn’t provide benchmarks on other data types.
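To make the memory side of that trade-off concrete, here is a rough back-of-the-envelope sketch of how many model parameters could fit in each GPU's HBM3e at different precisions. It is an illustration only, not an Nvidia figure: it assumes 1GB equals one billion bytes, counts model weights only (no KV cache, activations, or runtime overhead), and the function name and list of formats are chosen for this example.

```python
# Bits per value for common floating-point formats (illustrative table).
BITS_PER_VALUE = {"FP4": 4, "FP8": 8, "FP16": 16, "FP32": 32, "FP64": 64}


def max_params_billions(memory_gb: float, dtype: str) -> float:
    """Upper bound on parameters (in billions) whose weights fit in memory_gb.

    Assumes 1 GB = 1e9 bytes and counts weights only.
    """
    bytes_per_value = BITS_PER_VALUE[dtype] / 8
    return memory_gb / bytes_per_value


# 192GB (Blackwell) and 288GB (Blackwell Ultra) are the capacities Buck cited.
for dtype in BITS_PER_VALUE:
    blackwell = max_params_billions(192, dtype)
    ultra = max_params_billions(288, dtype)
    print(f"{dtype}: Blackwell {blackwell:g}B, Blackwell Ultra {ultra:g}B")
```

Under these simplifying assumptions, halving the bits per value doubles the number of parameters that fit, which is one reason low-precision formats matter for running larger models on a single GPU. Real deployments need headroom beyond the weights, so practical limits would be lower.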

Sag said that the FP4 data type shows Blackwell Ultra in its best light, but it doesn’t paint the whole picture of the chip’s performance.

“Nvidia really wants FP4 to take off because it’s computationally very advantageous for the company,” Sag said.

Blackwell Ultra data center and desktop systems announced

The company also announced new data center and desktop systems with Blackwell Ultra and Blackwell GPUs.

The GB300 NVL72 server system combines 72 Blackwell Ultra GPUs and 36 homegrown Grace CPUs. It’s an upgrade from the GB200 NVL72 server, which is based on the current Blackwell generation and started shipping last year.

“We’ve upgraded the NVL72 design for improved energy efficiency and serviceability,” Buck said.

The GB300 NVL72 rack will offer 1.1 exaflops of FP4 inference performance and 20 terabytes of HBM3e memory. The predecessor system, GB200 NVL72, had 13.5 terabytes of HBM3e memory.
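The rack-level memory figures line up with the per-GPU capacities quoted earlier (192GB for Blackwell, 288GB for Blackwell Ultra) multiplied by 72 GPUs. The short check below is a sketch under one assumption of ours: that a terabyte is counted as 1,024GB, which reproduces the 13.5-terabyte figure exactly. The function name is hypothetical.

```python
def rack_memory_tb(gpus: int, gb_per_gpu: int, gb_per_tb: int = 1024) -> float:
    """Total GPU memory in a rack, in terabytes.

    gb_per_tb=1024 is an assumption; pass 1000 for decimal terabytes.
    """
    return gpus * gb_per_gpu / gb_per_tb


gb200 = rack_memory_tb(72, 192)  # current Blackwell generation
gb300 = rack_memory_tb(72, 288)  # Blackwell Ultra
print(f"GB200 NVL72: {gb200} TB")
print(f"GB300 NVL72: {gb300} TB")
print(f"Increase: {gb300 / gb200:.1f}x")
```

With decimal terabytes the totals come to about 13.8 and 20.7, so the quoted 20 terabytes appears to be a rounded figure either way.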

Buck compared the 671-billion parameter DeepSeek-R1 reasoning model on GB300 NVL72 to a comparable system with H100 GPUs, which is based on the Hopper architecture. Hopper systems can generate 100 tokens per second and generate answers in 1.5 minutes, Buck said.

“With GB300 NVL72 you can 10x the token rate and provide that question’s answer in only 10 seconds,” Buck said.
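Buck's numbers can be sanity-checked with simple arithmetic. The sketch below assumes a constant token rate for the whole response and ignores any startup latency; the variable names are ours, not Nvidia's.

```python
# Figures Buck quoted for the Hopper-based system.
hopper_tokens_per_sec = 100
hopper_answer_seconds = 1.5 * 60  # 1.5 minutes

# Implied length of the response (thinking tokens plus answer).
tokens_per_answer = hopper_tokens_per_sec * hopper_answer_seconds

# Buck's claimed 10x token rate on GB300 NVL72.
gb300_tokens_per_sec = 10 * hopper_tokens_per_sec
gb300_answer_seconds = tokens_per_answer / gb300_tokens_per_sec

print(f"Implied tokens per answer: {tokens_per_answer:.0f}")
print(f"GB300 NVL72 answer time: {gb300_answer_seconds:.0f} seconds")
```

The implied response length of roughly 9,000 tokens is in line with Buck's description of reasoning models producing thousands of thinking tokens, and a 10x token rate brings the answer time to about nine seconds, close to the 10 seconds he cited.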

Nvidia also announced the DGX SuperPod with DGX B300 systems, which networks multiple DGX B300 systems in a cluster of 576 Blackwell Ultra GPUs and 288 Grace CPUs.

Dell announced it would support Nvidia’s Blackwell Ultra GPUs in its servers, including the GB300 NVL72.

“We expect that the [GPU] performance will be double the previous generations of Nvidia accelerators,” said Varun Chhabra, Dell’s senior vice president of marketing for the infrastructure products group and telecom, in a separate press briefing.

Nvidia also announced workstations, desktops, and laptops with Blackwell Ultra and Blackwell.

Nvidia’s DGX Station, a desktop AI workstation, includes the GB300 super chip, which combines the Grace CPU and Blackwell Ultra GPU. It provides 20 petaflops of AI performance and 784GB of unified system memory, Buck said. Asus, Boxx, Dell, HP, Lambda, and Supermicro will ship systems later this year.

Buck claimed that Nvidia’s DGX Spark mini-desktop, previously dubbed “Project Digits,” was “the world’s smallest AI supercomputer.”

The system includes the GB10 super chip, which combines the Grace CPU with the Blackwell GPU, and comes with 128GB of unified memory. Dell, HP, Lenovo and other companies will launch branded versions of DGX Spark later this year, Buck said.

The GPU maker also announced that PC makers would ship laptops and desktops with its RTX Pro GPUs.

Source: Network World