Dell updates servers with AMD AI accelerators

Dell is now shipping PowerEdge servers loaded with AMD’s latest Instinct GPU accelerator and offering deployment-support services and software for rapidly building generative AI applications.

The Dell PowerEdge XE9680 is available with AMD Instinct MI300X accelerators, AMD’s top-of-the-line part and a competitor to Nvidia’s Hopper generation of processors. The PowerEdge server is designed for enterprises using generative AI and supports up to eight MI300X accelerators, with a combined 1.5 TB of HBM3 memory and 42 petaFLOPS of peak theoretical FP8 performance with sparsity.
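The server-level totals follow from simple multiplication across the eight accelerators. The sketch below reproduces them; the per-accelerator figures (192 GB of HBM3 and roughly 5.22 petaFLOPS of peak FP8 with sparsity) are assumptions drawn from AMD's published MI300X specifications, not from Dell's announcement.

```python
# Per-accelerator figures are assumptions based on AMD's published MI300X specs;
# the article itself only states the eight-GPU totals.
HBM3_GB_PER_GPU = 192              # GB of HBM3 per MI300X
FP8_SPARSE_PFLOPS_PER_GPU = 5.22   # peak theoretical FP8 with sparsity, petaFLOPS
GPUS_PER_SERVER = 8                # maximum accelerators in one XE9680


def server_totals(gpus=GPUS_PER_SERVER):
    """Return (total HBM3 in TB, total peak FP8-with-sparsity petaFLOPS)."""
    memory_tb = gpus * HBM3_GB_PER_GPU / 1024
    fp8_pflops = gpus * FP8_SPARSE_PFLOPS_PER_GPU
    return memory_tb, fp8_pflops


memory_tb, fp8_pflops = server_totals()
print(f"{memory_tb:.1f} TB HBM3, {fp8_pflops:.0f} petaFLOPS FP8 (sparse)")
# prints: 1.5 TB HBM3, 42 petaFLOPS FP8 (sparse)
```

These are peak theoretical numbers; sustained throughput on real workloads will be lower.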

This configuration is designed for faster training and inferencing of large language models (LLMs). As part of testing, Dell deployed a 70-billion-parameter Llama 2 model on a server with a single MI300X accelerator. It also fine-tuned that same model with FP16 precision on one Dell PowerEdge XE9680 server with eight AMD Instinct MI300X accelerators.
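A rough memory estimate shows why a single accelerator can hold the model for inference while fine-tuning calls for all eight. The sketch below is back-of-the-envelope arithmetic, not Dell's methodology. It assumes 16-bit weights for inference (the article states FP16 only for the fine-tuning run), 192 GB of HBM3 per MI300X, and a common rule of thumb of about 16 bytes per parameter for full fine-tuning with an Adam-style optimizer. It ignores activations and the KV cache.

```python
# Back-of-the-envelope memory estimate. All constants are assumptions for
# illustration, not figures reported by Dell.
PARAMS_BILLION = 70            # Llama 2 70B
HBM3_GB_PER_GPU = 192          # assumed MI300X capacity
BYTES_FP16_WEIGHTS = 2         # 16-bit weights only (inference)
BYTES_FULL_FINETUNE = 16       # rule of thumb: weights + gradients + optimizer state


def footprint_gb(params_billion, bytes_per_param):
    """Approximate footprint in GB: 1e9 params * bytes / 1e9 bytes per GB."""
    return params_billion * bytes_per_param


inference_gb = footprint_gb(PARAMS_BILLION, BYTES_FP16_WEIGHTS)
finetune_gb = footprint_gb(PARAMS_BILLION, BYTES_FULL_FINETUNE)

fits_one_gpu = inference_gb <= HBM3_GB_PER_GPU
fits_eight_gpus = finetune_gb <= 8 * HBM3_GB_PER_GPU

print(f"FP16 weights: {inference_gb} GB, fits on one GPU: {fits_one_gpu}")
print(f"Full fine-tune: {finetune_gb} GB, fits on eight GPUs: {fits_eight_gpus}")
```

Under these assumptions the weights need about 140 GB, under the 192 GB on one accelerator, while full fine-tuning needs roughly 1,120 GB, which fits only within the eight-accelerator pool of 1,536 GB.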

The PowerEdge XE9680 includes Dell’s OpenManage Enterprise centralized server management software for fast, simple deployment, APEX AIOps automation software, and integrated cyber recovery and zero-trust security features.

Dell validated designs aim to streamline genAI deployments

Also new from Dell is its Validated Design for Generative AI with AMD, a standard framework for organizations that run their own LLMs. Announced in May and available today, the validated designs are aimed at making it easier for enterprises to deploy systems for LLM inferencing and model customization.

“This design guidance gives organizations and developers comprehensive directions to implement LLM inferencing and model customization, as well as advanced techniques like fine-tuning and retrieval augmented generation (RAG). Built on open standards and reducing the need for proprietary AI software suites, developers can simplify development and freely customize workflows with open-source LLM models from partners including Hugging Face and Meta,” wrote Luke Mahon, director of the Dell AI solutions technical marketing engineering team, in a blog post about the new products and services.

On the software side, Dell is providing AMD ROCm-powered AI frameworks such as PyTorch, TensorFlow, ONNX-RT and JAX to support open-source LLMs, as well as the full stack of drivers, dev toolkits and APIs for AMD Instinct accelerators.
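For developers, one practical consequence of the ROCm stack is that ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` interface used on other hardware. The sketch below is a minimal check, not part of Dell's validated design; the helper name `describe_accelerator_backend` is hypothetical, and the function falls back gracefully when PyTorch is not installed.

```python
def describe_accelerator_backend():
    """Report whether PyTorch is present, whether it is a ROCm build, and GPU count.

    Hypothetical helper for illustration; not part of Dell's or AMD's tooling.
    """
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "rocm_build": False, "gpu_count": 0}

    # torch.version.hip is a version string on ROCm builds and None otherwise.
    hip_version = getattr(torch.version, "hip", None)
    # ROCm builds reuse the torch.cuda namespace for AMD GPUs.
    gpu_count = torch.cuda.device_count() if torch.cuda.is_available() else 0
    return {
        "torch_installed": True,
        "rocm_build": hip_version is not None,
        "gpu_count": gpu_count,
    }


info = describe_accelerator_backend()
print(info)
```

On a fully populated XE9680 with a ROCm build of PyTorch, the reported GPU count would be expected to be eight, though the output depends entirely on the local installation.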

In addition, “Dell Omnia streamlines the creation and management of AI clusters automating configuration for efficient workload processing,” Mahon wrote. (Omnia is an open-source toolkit for deploying and managing high-performance clusters for HPC, AI, and data analytics workloads.)

An enterprise SONiC distribution by Dell Technologies combines the open-source SONiC platform with Dell PowerSwitch to deliver a scalable networking solution that offers advanced features and enterprise-grade support, according to Mahon.

To help customers with their AI initiatives, Dell unveiled platform implementation services in September. Consultants will help customers establish a platform for building and deploying AI tools and frameworks. Dell will also offer half-day Accelerator Workshops to help companies determine how they can derive maximum value from AI.

Source: Network World