AI test bed proposal for DoE a good first step: Analyst

A recommendation by a US Department of Energy (DoE) working group to create an AI test bed tasked with finding ways to make data centers more energy efficient is a good step forward, but much remains to be done, an analyst said today.

Lucas Beran, a research director with Dell’Oro Group, whose primary focus revolves around power and cooling technologies that enable sustainable data centers, said changes are needed because the “compute that runs AI workloads is very different from general purpose computing based on CPUs. This is significantly changing the trajectory of how you power this infrastructure, how you cool this infrastructure, and just how much energy in general could be consumed from these types of deployment in the future.”

A report submitted to Secretary of Energy Jennifer Granholm in late July by a department advisory board working group tasked with finding alternative ways to create energy stated that “connection requests for hyperscale facilities of 300-1000MW or larger with lead times of one to three years are stretching the capacity of local grids to deliver and supply power at that pace.”

It noted, “a significant factor today and in the medium-term (2030+) is expanding power demand of AI applications. Advancements in both hardware and software have enabled development of large language models (LLMs) that now approach human capabilities on a wide range of valuable tasks. As these models have grown larger, so have concerns about sizeable future increases in the energy to deploy LLMs as AI tools become more deeply woven into society.”

The report added that the scale of the potential growth of both the electricity and the information technology sectors due to AI is extraordinary and represents the leading edge of projected electricity demand growth.

Purpose of the test bed

It stated that the AI test bed “can allow researchers from the national labs, academia, and industry to collaborate in development and assessment of algorithms for energy-efficient and/or energy-flexible AI training and inference, advancing the nation’s AI capabilities and building on the success of comparable public-private efforts that have accelerated advances in high-performance computing.”

For immediate impact, authors of the report wrote, “the Secretary should convene energy utilities, data center developers and operators, and other key stakeholders to start active dialog on how to address current electricity supply bottlenecks,” as well as “to develop strategies for how to generate and deliver the power needed to sustain AI leadership into the future.”

The recommendation to create a test bed, said Beran, “is really step one for the DoE in terms of understanding what infrastructure is being used, and how much energy it consumes. And then, once we have this starting data point, how do we improve from there? This really made me think of the old saying, ‘you can’t improve on what you can’t measure.’ They have to start somewhere and set a base case and that to me, is what this is.”

Developing solutions

He said the hyperscalers, to which the working group reached out to solicit views, face “unsolved problems in how to manage power demands from AI workloads. It is not like the industry has solved the problems or challenges, it is more like, ‘we have identified challenges with the AI workload and energy profile requirements, and now we need to start to develop some solutions for them.’”

Those solutions, said Beran, range from changing how data center facilities are architected to making system changes to accommodate the workload profile.

Of note, he said, is the need to improve energy efficiency. He added that while sustainability has “been such a critical factor the past couple of years, it has really started to take a backseat to some of the AI growth requirements. Trying to manage both is important.”

In addition, Thomas Randall, director of AI market research at Info-Tech Research Group, said via email, “as AI models get larger and require more compute power, the amount of energy required to support this market will equally increase. Without a broader energy strategy, countries that house the data centers AI companies are using will face ongoing CO2 emission issues, limitations on growth, and opportunity costs for energy use elsewhere.”

Source: Network World