New middleware doubles GPU computational efficiency for AI workloads in trials, says Fujitsu

New middleware from Fujitsu has achieved more than a 2x increase in GPU computational efficiency for artificial intelligence (AI) workloads in trials, according to the company. Fujitsu designed the technology specifically to address GPU shortages and the limitations GPUs face under the computing demands of AI.

The goal of the middleware, which was released to customers globally today, is to improve resource allocation and memory management across the various platforms and applications that use AI, according to a press release. Fujitsu has already been piloting it with various partners, and more technology trials are planned to begin this month.

Fujitsu began testing its new middleware with AWL, Xtreme-D, and Morgenrot in May, with results demonstrating up to a 2.25x increase in computational efficiency when running AI workloads, the company said. The partners also saw a substantial increase in the number of concurrently handled AI processes across diverse cloud environments and servers when using the middleware.

“By enabling GPU sharing between multiple jobs, we achieved a remarkable near 10% reduction in overall execution time compared to running jobs sequentially on two GPUs,” observed Hisashi Ito, CTO of Morgenrot, in a press statement. “This parallel processing capability [allowed for] simultaneous execution of long training sessions for model building and shorter inference/testing tasks, all within constrained resources.”

This month, Tradom also will begin trials using the new product, while Sakura Internet is conducting a feasibility study on the use of the technology for its data center operations, according to Fujitsu.

AI processing optimization

GPUs are better suited to AI processing than CPUs, and thus their use has been increasing dramatically. However, this has also driven up data center power consumption considerably and created a shortage of GPUs, leaving companies looking for alternative ways to optimize AI workloads.

“The rapid expansion of compute infrastructure to support training for genAI has created a major electrical power availability challenge,” said a Gartner research note on emerging technologies for energy-efficient generative AI compute systems by researchers Gaurav Gupta, Menglin Cao, Alan Priestley, Akhil Singh, and Joseph Unsworth.

This means those running AI data centers must find solutions to the problem now to mitigate the challenges to their operations, which include increased costs, insufficient power availability, and poorer sustainability performance. “All of these will be eventually passed on to data center operators’ customers and end users,” the researchers noted.

At the same time, data centers must address the performance bottlenecks that the drive to GPU-assisted AI is causing, noted Eckhardt Fischer, senior research analyst at IDC. “Any improvement in the computer system to reduce this bottleneck will generally show a corresponding improvement in output,” he observed.

These bottlenecks for AI/genAI compute requirements include memory and networking, because “even the current Moore’s Law can’t keep up with explosive compute needs,” noted Gartner’s Gupta.

Optimizing resource allocation

Fujitsu’s AI computing broker middleware aims to solve this in part by combining adaptive GPU allocator technology, which the company developed in November 2023, with AI-processing optimization technologies, the company said. This allows the middleware to automatically identify and optimize CPU and GPU resource allocation for AI processing across multiple programs, giving priority to processes with high execution efficiency.

However, rather than allocating resources conventionally on a per-job basis, Fujitsu’s AI computing broker dynamically allocates them on a per-GPU basis, the company said. This is aimed at improving availability rates and allowing numerous AI processes to run concurrently without concern for GPU memory usage or physical capacity.
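Fujitsu has not published implementation details, but the broker behavior described above can be illustrated with a toy scheduler. The sketch below is purely hypothetical — the Job and GPU classes, the efficiency metric, and the greedy packing strategy are assumptions for illustration, not Fujitsu’s actual design. It places high-efficiency jobs first and packs multiple jobs onto a GPU as long as free memory allows, rather than dedicating one GPU per job:

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    name: str
    mem_gb: int        # GPU memory the job needs (assumed known up front)
    efficiency: float  # assumed execution-efficiency score; higher is better

@dataclass
class GPU:
    gpu_id: int
    mem_free_gb: int
    jobs: list = field(default_factory=list)

def broker(jobs, gpus):
    """Greedy sketch of per-GPU allocation: schedule the most
    efficient jobs first and co-locate jobs on a GPU while its
    memory holds out, instead of one job per GPU."""
    placements = {}
    for job in sorted(jobs, key=lambda j: j.efficiency, reverse=True):
        # Candidate GPUs are those with enough free memory left.
        candidates = [g for g in gpus if g.mem_free_gb >= job.mem_gb]
        if not candidates:
            placements[job.name] = None  # no capacity: job must wait
            continue
        # Place on the GPU with the most free memory remaining.
        gpu = max(candidates, key=lambda g: g.mem_free_gb)
        gpu.mem_free_gb -= job.mem_gb
        gpu.jobs.append(job.name)
        placements[job.name] = gpu.gpu_id
    return placements

gpus = [GPU(0, 24), GPU(1, 24)]
jobs = [Job("train", 16, 0.9), Job("infer-a", 4, 0.7), Job("infer-b", 4, 0.6)]
print(broker(jobs, gpus))  # → {'train': 0, 'infer-a': 1, 'infer-b': 1}
```

In this toy run, the long training job and the two short inference jobs all run concurrently across two GPUs — the kind of mixed training-and-inference sharing Morgenrot’s CTO describes in the quote above.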

The concept behind the middleware makes sense, Gupta noted, as power consumption of GPUs “is a big concern and hence energy efficiency comes into picture.”

“This doesn’t solve the shortage problem, but improves utilization and hence efficiency of operations – so, in a way, you can do more with less – as long as technology works,” he said, which, as it’s early days, remains to be seen.

However, Gupta added, if Fujitsu’s AI-focused middleware can lead to any improvement in memory and GPU utilization, it is worth following, and observing its adoption and the future competitive landscape for similar solutions.

Source: Network World