Computex 2024 is taking place in Taiwan this week, which means lots of hardware news as the OEMs and parts suppliers of the world gather to show off their latest wares. Nvidia, for example, unveiled new Blackwell systems and announced general availability of its Spectrum-X Ethernet stack for AI workloads.
For its part, AMD introduced the Instinct MI325X GPU, which pairs a faster processor with considerably more high-bandwidth memory (HBM) than either its predecessor, the MI300X, or Nvidia’s Blackwell processor. The MI325X has 288 GB of HBM3e memory with 6.0 TB/s of memory bandwidth, while the MI300X has 192 GB of HBM3 memory and 5.3 TB/s of memory bandwidth.
AMD notes that Nvidia’s Hopper-generation H200 accelerator tops out at 141 GB of HBM3e memory capacity and 4.8 TB/s of GPU memory bandwidth. But that’s Hopper. Blackwell is the new generation coming this fall, with Nvidia citing up to 384 GB of HBM memory and up to 16 TB/s of bandwidth.
Still, the MI325X is no slouch, with peak theoretical throughput of 2.6 petaflops for 8-bit floating point (FP8) and 1.3 petaflops for 16-bit floating point (FP16). That’s 30% higher than what the H200 can achieve, AMD said.
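AMD’s 30% figure can be sanity-checked against Nvidia’s published specs. A minimal sketch, assuming a dense FP8 peak of roughly 1.98 petaflops for the H200 (a figure from Nvidia’s spec sheets, not from this article):

```python
# Sanity-check AMD's "30% higher than H200" FP8 claim.
mi325x_fp8_pflops = 2.6   # per AMD's announcement
h200_fp8_pflops = 1.98    # assumed dense FP8 peak from Nvidia's published specs

advantage = mi325x_fp8_pflops / h200_fp8_pflops - 1
print(f"MI325X FP8 advantage over H200: {advantage:.0%}")
```

Under those assumptions the advantage lands at roughly 31%, consistent with AMD’s claim.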
What’s more, the MI325X will enable servers to handle a 1-trillion-parameter model in its entirety, double the size that the H200 can handle, according to AMD.
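Back-of-the-envelope math shows why the claim is plausible. A minimal sketch, assuming a typical eight-GPU server configuration (the eight-accelerator platform is an assumption; the 288 GB per GPU is from AMD’s announcement) and counting model weights only, not KV cache or activations:

```python
def model_memory_gb(params_billion, bytes_per_param):
    """Rough memory footprint of model weights alone (ignores KV cache, activations)."""
    return params_billion * 1e9 * bytes_per_param / 1e9  # GB, decimal units

# Assumed platform: 8 x MI325X at 288 GB of HBM3e each
server_hbm_gb = 8 * 288  # 2304 GB of aggregate HBM

for fmt, bytes_pp in [("FP16", 2), ("FP8", 1)]:
    need = model_memory_gb(1000, bytes_pp)  # 1 trillion parameters
    print(f"1T params @ {fmt}: {need:.0f} GB needed, "
          f"fits in {server_hbm_gb} GB: {need <= server_hbm_gb}")
```

At FP16, a 1-trillion-parameter model needs about 2,000 GB for its weights, which squeezes into the 2,304 GB of aggregate HBM on such a server; at FP8 the headroom roughly doubles.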
The MI325X uses AMD’s CDNA 3 architecture, the same as the MI300X. CDNA is the data center counterpart of AMD’s RDNA gaming graphics architecture, expressly designed for data center workloads like generative AI and high-performance computing.
In 2025, AMD plans to release the new AMD CDNA 4 architecture and with it the Instinct MI350 series, which AMD says will bring up to a 35x increase in AI inference performance compared to the AMD Instinct MI300 series. And in 2026, the AMD Instinct MI400 series will arrive, based on the AMD CDNA “Next” architecture.
“With our updated annual cadence of products, we are relentless in our pace of innovation, providing the leadership capabilities and performance the AI industry and our customers expect to drive the next evolution of data center AI training and inference,” said Brad McCredie, corporate vice president, data center accelerated compute, AMD, in a statement.
The MI350X will come with 288 GB of HBM3e, like the MI325X. The chip will be manufactured on a 3-nanometer process – a notable shrink from the 5nm and 6nm nodes used for the MI300 chips – and will add support for the FP4 and FP6 floating-point data formats. FP6 support is new for the Instinct line; FP4 and FP8 have seen broader industry adoption to date.
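The appeal of these narrower formats is straightforwardly about bits per parameter. A minimal sketch, assuming the sign/exponent/mantissa layouts defined by the OCP microscaling (MX) formats, which are one common convention for FP8/FP6/FP4 (the specific layouts are an assumption, not from this article):

```python
# Assumed (sign, exponent, mantissa) bit layouts, per the OCP MX format spec.
formats = {
    "FP8 (E4M3)": (1, 4, 3),
    "FP6 (E3M2)": (1, 3, 2),
    "FP4 (E2M1)": (1, 2, 1),
}

for name, (s, e, m) in formats.items():
    bits = s + e + m
    # Weight storage for a 1-trillion-parameter model at this precision
    gb_per_trillion = 1e12 * bits / 8 / 1e9
    print(f"{name}: {bits} bits -> {gb_per_trillion:.0f} GB per 1T parameters")
```

Dropping from FP8 to FP6 cuts weight storage by 25%, and FP4 halves it, which is why vendors keep pushing precision downward for inference.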
The Instinct line may be running a distant second to the Nvidia Hopper line of GPUs, but it is gaining ground with key partners. AMD noted demand from Microsoft for its Azure OpenAI services, Dell Technologies for enterprise AI workloads, Supermicro with multiple solutions using AMD Instinct accelerators, Lenovo offering Instinct in its ThinkSystem servers, and HPE using the accelerators in its HPE Cray servers.
Source: Network World