Cisco is unveiling a new server and preconfigured designs aimed at helping enterprise customers implement infrastructure that can handle the massive data sets and complex algorithms of AI training and inference workloads.
On the server side, Cisco announced a more powerful member of its Unified Computing System (UCS) family, the UCS C885A M8. The 8U rack server is built on Nvidia’s HGX platform and designed to deliver the accelerated compute capabilities needed for AI workloads such as large language model (LLM) training, model fine-tuning, large model inferencing, and retrieval-augmented generation (RAG).
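For readers unfamiliar with the last of those workloads, retrieval-augmented generation grounds an LLM’s answers in documents fetched at query time. The minimal Python sketch below illustrates the general pattern only; the hash-based embedding and toy corpus are illustrative stand-ins (a production deployment would run a trained embedding model and the LLM itself on the server’s GPUs), and nothing here reflects Cisco or Nvidia software.

```python
import numpy as np

# Toy corpus standing in for an enterprise document store.
DOCS = [
    "UCS C885A M8 is an 8U rack server built on the Nvidia HGX platform.",
    "BlueField-3 DPUs offload data movement between dense GPU servers.",
    "Cisco Intersight manages servers and Kubernetes from one console.",
]

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Hash-based bag-of-words embedding; a real RAG system would use
    a trained embedding model instead."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by cosine similarity to the query embedding."""
    q = embed(query)
    scores = [float(q @ embed(d)) for d in DOCS]
    return [DOCS[i] for i in np.argsort(scores)[::-1][:k]]

query = "How are GPU servers managed?"
context = "\n".join(retrieve(query))
# The retrieved context is prepended to the prompt sent to the LLM,
# grounding the generated answer in the fetched documents.
prompt = f"Answer using this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```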
The new UCS C885A M8 system incorporates technology stemming from Cisco and Nvidia’s expanded partnership, announced earlier this year. The two companies are offering integrated software and hardware packages for customers looking to spin up AI infrastructure.
As part of that announcement, the companies said Nvidia’s Tensor Core GPUs would be available in Cisco’s current M7 UCS rack and blade servers, including Cisco UCS X-Series and UCS X-Series Direct, to support AI and data-intensive workloads in the data center and at the edge. In addition, the companies are offering a turnkey AI package called the Cisco Nexus HyperFabric AI cluster, which includes a Cisco 6000 series switch for spine and leaf implementation supporting 400G and 800G Ethernet fabrics, GPUs, Nvidia BlueField-3 DPUs and SuperNICs as well as AI reference designs.
The UCS C885A M8 can be configured with up to eight Nvidia H100 or H200 Tensor Core GPUs or AMD MI300X OAM GPUs to accelerate AI workloads, as well as Nvidia BlueField-3 DPUs to accelerate GPU data access across a cluster of dense GPU servers.
The Nvidia HGX platform includes a number of networking options, at speeds of up to 400Gbps, using Nvidia Quantum-2 InfiniBand or Spectrum-X Ethernet, according to Nvidia.
The server is managed by Cisco Intersight, a SaaS-delivered package that can manage a variety of systems from Kubernetes containers to applications, servers, and hyperconverged environments from a single location.
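Intersight is driven by a REST API as well as its web console. As a rough illustration of what programmatic management looks like, the hedged Python sketch below queries rack-server inventory; the /api/v1/compute/RackUnits endpoint follows Intersight’s published object model, but real calls require Cisco’s API-key-based HTTP signature authentication (or the official intersight SDK), which is reduced to a placeholder header here.

```python
import requests

# Intersight's northbound SaaS REST endpoint.
BASE_URL = "https://intersight.com/api/v1"

def list_rack_servers(auth_header: str) -> list[dict]:
    """Fetch rack-server inventory. NOTE: real Intersight requests are
    signed with an API key ID and secret (HTTP signature scheme); the
    plain Authorization header below is a simplified placeholder."""
    resp = requests.get(
        f"{BASE_URL}/compute/RackUnits",
        headers={"Authorization": auth_header},
        params={"$select": "Name,Model,Serial"},  # OData-style query
        timeout=30,
    )
    resp.raise_for_status()
    # Intersight collection GETs return matches under "Results".
    return resp.json().get("Results", [])

if __name__ == "__main__":
    for server in list_rack_servers("placeholder-signature"):
        print(server.get("Name"), server.get("Model"), server.get("Serial"))
```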
Cisco expects customers will combine the new servers with its recently announced Nexus 9364E-SG2 switch. The high-density 800G aggregation box supports port speeds of 400, 200, and 100 Gbps and includes support for the high-speed optical transceiver formats Octal Small Form Factor Pluggable (OSFP) and Quad Small Form Factor Pluggable Double Density (QSFP-DD).
“To train GenAI models, clusters of these powerful servers often work in unison, generating an immense flow of data that necessitates a network fabric capable of handling high bandwidth with minimal latency. This is where the newly released Cisco Nexus 9364E-SG2 Switch shines,” wrote Jeremy Foster and Kevin Wollenweber in a blog post. Foster is senior vice president and general manager of Cisco Compute, and Wollenweber is senior vice president and general manager of Cisco Networking, Data Center and Provider Connectivity.
“Its high-density 800G aggregation ensures smooth data flow between servers, while advanced congestion management and large buffer sizes minimize packet drops—keeping latency low and training performance high. The Nexus 9364E-SG2 serves as a cornerstone for a highly scalable network infrastructure, allowing AI clusters to expand seamlessly as organizational needs grow,” Foster and Wollenweber wrote.
New Cisco AI Pods
Along with the new hardware, Cisco introduced AI Pods, which are preconfigured, validated, and optimized infrastructure packages that customers can plug into their data center or edge environments as needed. The Pods are based on Cisco Validated Design principles, which offer customers pre-tested and validated network designs that provide a blueprint for building reliable, scalable, and secure network infrastructures, according to Cisco.
The Pods include Nvidia AI Enterprise, which features pretrained models and development tools for production-ready AI, and are managed through Cisco Intersight.
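Nvidia AI Enterprise typically exposes its inference microservices (NIM) through an OpenAI-compatible HTTP API. As a hedged sketch of what edge inferencing against such a Pod might look like, the snippet below posts a chat completion to a hypothetical locally hosted endpoint; the URL and model name are illustrative assumptions, not Cisco-published values.

```python
import requests

# Hypothetical NIM-style endpoint hosted inside an AI Pod; the URL and
# model name are illustrative assumptions, not Cisco-published values.
ENDPOINT = "http://ai-pod.example.internal:8000/v1/chat/completions"

payload = {
    "model": "meta/llama-3.1-8b-instruct",  # example pretrained model
    "messages": [
        {"role": "user", "content": "Summarize last week's sensor alerts."}
    ],
    "max_tokens": 256,
}

resp = requests.post(ENDPOINT, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```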
“The pre-sized and pre-validated bundles of infrastructure eliminate the guesswork from deploying edge inferencing, large-scale clusters, and other AI inferencing solutions, with more use cases planned for release over the next few months,” Foster and Wollenweber stated.
“Our goal is to enable customers to confidently deploy AI PODs with predictability around performance, scalability, cost, and outcomes, while shortening time to production-ready inferencing with a full stack of infrastructure, software, and AI toolsets.”
Driving the current AI announcements is the need to address the comprehensive infrastructure requirements enterprises face across the AI lifecycle, from building and training sophisticated models to deploying them widely for inferencing, Foster and Wollenweber stated.
“Many of the CIOs and technology leaders we talk to today recognize this. In fact, most say that their organizations are planning full GenAI adoption within the next two years. Yet according to the Cisco AI Readiness Index, only 14% of organizations report that their infrastructures are ready for AI today. What’s more, a staggering 85% of AI projects stall or are disrupted once they have started,” Foster and Wollenweber stated.
“The reason? There’s a high barrier to entry. It can require an organization to completely overhaul infrastructure to meet the demands of specific AI use cases, build the skillsets needed to develop and support AI, and contend with the additional cost and complexity of securing and managing these new workloads.”
The Cisco UCS C885A M8 can be ordered now and is expected to ship by the end of this year. Cisco AI Pods can be ordered beginning in November 2024.
Source: Network World