IBM is expanding its AI accelerator options for enterprise users of its cloud service. AMD Instinct MI300X accelerators are now available as a service on IBM Cloud, the vendors announced.
With 192GB of high-bandwidth memory, the MI300X accelerators are equipped for inferencing and fine-tuning of large AI models. That memory capacity can help customers run larger models on fewer GPUs, potentially lowering inferencing costs, according to AMD.
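The arithmetic behind that claim is straightforward. As a rough, back-of-envelope sketch (it counts only model weights, ignoring the KV cache, activations, and framework overhead that real deployments also need), a hypothetical 70-billion-parameter model stored at 16-bit precision occupies about 140GB, fitting on a single 192GB MI300X where two 80GB accelerators would otherwise be required:

```python
# Back-of-envelope estimate: how many accelerators are needed just to
# hold a model's weights in memory. Illustrative only; real deployments
# also need room for the KV cache, activations, and framework overhead.
import math

def min_gpus_for_weights(params_billions: float, bytes_per_param: int, gpu_mem_gb: int) -> int:
    weights_gb = params_billions * 1e9 * bytes_per_param / 1e9  # GB of weights alone
    return math.ceil(weights_gb / gpu_mem_gb)

# A hypothetical 70B-parameter model at 16-bit precision (2 bytes/param)
# has ~140GB of weights.
print(min_gpus_for_weights(70, 2, 192))  # 1 -- fits on one 192GB MI300X
print(min_gpus_for_weights(70, 2, 80))   # 2 -- split across two 80GB GPUs
```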
To help optimize performance for enterprise customers running AI applications, the partnership calls for AMD Instinct MI300X accelerators to be available as a service on IBM Cloud Virtual Servers for VPC, as well as through container support with IBM Cloud Kubernetes Service and Red Hat OpenShift on IBM Cloud.
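For customers deploying through containers, a hedged sketch of what consuming a GPU might look like: on a Kubernetes cluster running AMD's open-source GPU device plugin, accelerators are advertised as the amd.com/gpu extended resource, which a pod requests in its resource limits. The example below uses the official Kubernetes Python client; the pod name and container image are illustrative, and IBM Cloud's managed services may surface MI300X GPUs through their own tooling.

```python
# Sketch: requesting an AMD GPU for a pod via the Kubernetes Python client.
# Assumes the AMD GPU device plugin is installed on the cluster, which
# advertises accelerators as the "amd.com/gpu" extended resource; how IBM
# Cloud Kubernetes Service surfaces MI300X GPUs may differ.
from kubernetes import client, config

config.load_kube_config()  # uses your local kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="mi300x-test"),  # hypothetical name
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="rocm-smoke-test",
                image="rocm/pytorch:latest",  # illustrative image choice
                command=["python", "-c",
                         "import torch; print(torch.cuda.is_available())"],
                resources=client.V1ResourceRequirements(
                    limits={"amd.com/gpu": "1"}  # ask the scheduler for one AMD GPU
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```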
For generative AI inferencing workloads, IBM said it plans to offer support for AMD Instinct MI300X accelerators within its watsonx AI and data platform, giving watsonx customers additional AI infrastructure resources for scaling AI workloads across hybrid cloud environments. In addition, the Red Hat Enterprise Linux AI and Red Hat OpenShift AI platforms can run Granite-family LLMs, with alignment tooling via InstructLab, on MI300X accelerators, according to IBM.
“As enterprises continue adopting larger AI models and datasets, it is critical that the accelerators within the system can process compute-intensive workloads with high performance and flexibility to scale,” said Philip Guido, executive vice president and chief commercial officer, AMD, in a statement.
AMD Instinct MI300X accelerators on IBM Cloud are expected to be generally available in the first half of 2025.
Existing support for Nvidia and Intel chips
The AMD partnership is just the latest effort by IBM to expand its AI accelerator options for cloud users.
In October, it announced that IBM Cloud users could access Nvidia H100 Tensor Core GPU instances in virtual private cloud and managed Red Hat OpenShift environments. The H100s joined a family of Nvidia GPUs and software that IBM already supported. At the time, IBM said the Nvidia H100 Tensor Core GPU could deliver up to 30x faster inference performance than the A100 Tensor Core GPU.
In August, IBM and Intel said they would deploy Intel Gaudi 3 AI accelerators as a service on IBM Cloud. Gaudi 3, integrated with 5th Gen Intel Xeon processors, supports enterprise AI workloads in the cloud and in data centers, giving customers visibility and control over their software stack and simplifying workload and application management, Intel stated. The Intel Gaudi 3 offering, expected to be available in early 2025, is aimed at helping IBM Cloud users scale enterprise AI application development more cost-effectively.
Source: Network World