In a reversal of earlier reports, Amazon Web Services (AWS) has clarified that it has not halted orders for Nvidia’s most advanced chip. Instead, AWS and Nvidia are collaborating on a project-specific upgrade, a Financial Times report said.
The Financial Times initially reported that AWS had “halted” its orders for Nvidia’s Grace Hopper chip, saying AWS had “fully transitioned” to the newest Blackwell GPUs, announced in March. Some interpreted this transition as a potential halt in orders for the current-generation chips.
The transition applies solely to Project Ceiba, a joint supercomputer initiative between AWS and Nvidia, said the report, quoting an AWS spokesperson.
“To be clear, AWS did not halt any orders from Nvidia,” the Financial Times report said quoting the AWS spokesperson. “In our close collaboration with Nvidia, we jointly decided to move Project Ceiba from Hopper to Blackwell GPUs, which offer a leap forward in performance.”
Project Ceiba refers to an AI supercomputer being co-developed by AWS and Nvidia.
The Financial Times has since updated its story to reflect that Amazon’s chip orders had not yet been placed, aligning with AWS’ clarification.
AWS continues to offer services based on Nvidia’s Hopper chips, which remain crucial for training AI systems. The decision to transition Project Ceiba to Blackwell chips aligns with Nvidia’s March announcement highlighting the superior performance of the new GPUs.
Blackwell promises a performance boost
Nvidia’s new Blackwell chips, unveiled by CEO Jensen Huang in March, are expected to be twice as powerful for training large language models (LLMs) such as OpenAI’s ChatGPT, compared to their predecessors.
The Nvidia GB200 Grace Blackwell Superchip integrates two Nvidia B200 Tensor Core GPUs with the Nvidia Grace CPU via a 900GBps ultra-low-power NVLink chip-to-chip interconnect.
To achieve the highest AI performance, GB200-powered systems can be paired with the newly announced Nvidia Quantum-X800 InfiniBand and Spectrum-X800 Ethernet platforms, which offer advanced networking capabilities at speeds up to 800Gbps.
“Selecting the flagship Blackwell chips in lieu of less powerful Grace Hoppers from Nvidia makes more sense for advancing its AI training of LLMs, LVMs, and simulation with applications across industries,” said Neil Shah, VP for research and partner at Counterpoint Research. “For AWS, especially with this rapid evolution of the size of the training models, the cloud giant has to be prudent about its investments in getting the best ROI for the advanced compute investments as well as the efficiency of that compute from an energy consumption perspective. With Project Ceiba, the goalposts are actually moving and Amazon needs to be at the leading edge to catch up with Google and Microsoft in this AI race.”
AWS, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure will be among the first cloud service providers to offer Blackwell-powered instances, Nvidia said in its March announcement. Additionally, companies in the Nvidia Cloud Partner program, including Applied Digital, CoreWeave, Crusoe, IBM Cloud, and Lambda, will also provide these advanced instances.
Source: Network World