As concerns over AI energy consumption ratchet up, chip maker Nvidia is defending what it calls a steadfast commitment to sustainability.
At its AI Summit DC this week, the company will report that over the past 10 years its GPUs have achieved a 2,000X reduction in the energy used for training and a 100,000X reduction in the energy used for generating tokens. It also said its platform has delivered a 2,000X improvement in efficiency and a 4,000X improvement in computational performance over that same decade.
“If cars had improved their efficiency as much as we have improved that inference performance, you could drive over 300 years on a single tank of gas,” Bob Pette, Nvidia’s VP of enterprise platforms, said during a pre-brief on Monday. “At the core of accelerated computing is sustainable computing.”
Big tech commitments are ‘promising,’ but may not be enough
There are many statistics floating around about AI and data center energy use, further stoking fears among enterprises, regulatory bodies, and everyone in between.
According to one estimate, an AI query requires about ten times the electricity of a Google search: 2.9 watt-hours, compared to 0.3 watt-hours. Combine that with the fact that ChatGPT reached more than 100 million weekly users not long after its launch in November 2022 (and that’s just one AI platform out of hundreds), and you have a huge burden on the power grid.
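To put those figures in perspective, here is a rough back-of-the-envelope estimate. The per-query numbers come from the estimate above; the number of queries each weekly user issues is a purely hypothetical placeholder, not a reported statistic.

```python
# Rough back-of-the-envelope estimate of weekly AI query energy.
# Per-query figures come from the article; QUERIES_PER_USER_PER_WEEK
# is a hypothetical placeholder, not a reported statistic.

AI_QUERY_WH = 2.9               # watt-hours per AI query (cited estimate)
SEARCH_QUERY_WH = 0.3           # watt-hours per Google search (cited estimate)
WEEKLY_USERS = 100_000_000      # ChatGPT weekly users shortly after launch
QUERIES_PER_USER_PER_WEEK = 10  # assumed, for illustration only

weekly_wh = AI_QUERY_WH * WEEKLY_USERS * QUERIES_PER_USER_PER_WEEK
weekly_mwh = weekly_wh / 1e6    # 1 MWh = 1,000,000 Wh

print(f"AI query vs. search: {AI_QUERY_WH / SEARCH_QUERY_WH:.0f}x the energy")
print(f"Estimated weekly AI energy: {weekly_mwh:,.0f} MWh")
```

Even at that conservative assumed query rate, the math works out to thousands of megawatt-hours every week for a single platform.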
And it will get worse. Goldman Sachs estimates that data center power demand will grow 160% by 2030, rising from 1% to 2% of overall worldwide electricity usage today to 3% to 4%. While those shares may look small, the investment bank says the increased demand will drive electricity usage growth not seen in a generation.
“It’s absolutely true that innovation can be in direct conflict with company goals or mandates to reduce carbon footprint and emissions,” said Ken Ringdahl, CTO at software company Emburse. “Deep learning and large language models are a transformative technology, but they demand significant resources which implies an increased power draw and resultant emissions.”
Major enterprises are making commitments to reduce emissions and improve sustainability. Microsoft, for one, has struck a power purchase deal that will reopen the notorious Three Mile Island nuclear plant, and has committed to becoming carbon negative by 2030. Google, meanwhile, has been making progress with geothermal energy in Nevada.
“The deals that tech companies are making to put their own money up to achieve environmental goals are exciting,” said Josh Smith, energy policy lead at the nonprofit Abundance Institute.
However, we should be “dismayed” at the pace of these developments, he pointed out. For instance, Pennsylvania’s governor had to issue a special plea to fast-track Microsoft’s application, and Google’s deal likely won’t go through until the first or second quarter of 2025 because of the “complicated and long-winded regulatory process.”
He and other experts call for action at the federal level, contending that the US environmental protection system has effectively killed new projects and requires updates to encourage clean energy generation. Energy shortages are “self-inflicted” due to projects being stalled by red tape, said Smith.
“The fundamental misunderstanding in many conversations is that there’s a conflict between growth and green goals,” he pointed out. “The opposite is true — economic growth, technological progress, and environmental progress can go hand in hand.”
Ringdahl, for his part, suggested a few strategies: pick the right model (sometimes a mini model that requires significantly fewer resources will do just fine), compress the model, and be judicious in the use of AI.
“Generative AI is such a powerful utility in that it can be used in so many different ways and provide good results,” he said. “The question always is whether it’s the right tool for the job.”
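For readers who want a concrete starting point, the sketch below illustrates the “compress the model” strategy with one common technique: post-training dynamic quantization in PyTorch, which stores Linear-layer weights as int8 instead of fp32. The toy model is a stand-in, and this is a generic illustration, not a method attributed to Emburse.

```python
# Minimal sketch of post-training dynamic quantization in PyTorch.
# The tiny Sequential model is a stand-in for a real trained network.
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

# Replace fp32 Linear weights with int8 equivalents, dequantized on the fly.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def size_mb(m: nn.Module) -> float:
    """Serialize the model and report its on-disk size in megabytes."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
    torch.save(m.state_dict(), path)
    size = os.path.getsize(path) / 1e6
    os.remove(path)
    return size

print(f"fp32 model: {size_mb(model):.2f} MB")   # roughly 4x the int8 size
print(f"int8 model: {size_mb(quantized):.2f} MB")
```

Smaller weights mean less memory traffic per inference, which is typically where the energy savings come from.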
Cuda ‘dramatically’ improving energy efficiency
Nvidia “pioneered” accelerated computing and is addressing “problems that traditional computing couldn’t solve,” said Pette.
Now, with what he called the big bang of AI, with the advent of ChatGPT and other generative AI tools, “we’re really in the midst of the greatest computing transformation since IBM introduced the CPU back in 1964,” he said.
Still, CPU performance scaling has “slowed tremendously,” and Moore’s Law, the observation that the number of transistors on an integrated circuit doubles roughly every two years at little additional cost, “has come to an end,” said Pette.
“It’s no longer possible to scale the CPU at the same rate it was in the past,” he said. At the same time, the amount of computational work is growing exponentially, and Nvidia’s accelerated computing, he said, is helping to close that gap.
This all begins with the Cuda libraries, which the company first developed in 2006. Today it maintains a “massive” Cuda catalog of 900 libraries and AI models, plus 4,000 applications, used by more than five million developers; Pette claimed these have helped make the big bang of AI possible.
These “accelerate every industry and dramatically improve energy efficiency,” he said.
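As a taste of what that drop-in acceleration looks like in practice, the sketch below uses CuPy, a NumPy-compatible, CUDA-backed Python library. It is a generic example chosen for brevity, not one of the specific catalog items Nvidia announced, and it assumes an Nvidia GPU with CuPy installed.

```python
# Drop-in GPU acceleration: CuPy mirrors the NumPy API but executes
# on an Nvidia GPU via CUDA. Requires a CUDA-capable GPU and CuPy.
import numpy as np
import cupy as cp

x_cpu = np.random.rand(4096, 4096).astype(np.float32)

x_gpu = cp.asarray(x_cpu)   # copy the matrix into GPU memory
y_gpu = x_gpu @ x_gpu       # the matrix multiply runs on the GPU
y_cpu = cp.asnumpy(y_gpu)   # copy the result back to host memory

print(y_cpu.shape)          # (4096, 4096)
```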
He pointed out that when algorithms are accelerated by 100X, one might expect power requirements to grow by 100X as well; in practice, power consumption increases by only about 3X.
“That’s only three times as large, so you can imagine the net positive impact to our energy grid,” said Pette.
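The arithmetic behind that claim is straightforward: energy per task is power draw multiplied by run time, so a 100X speedup at 3X the power cuts per-task energy by roughly 100/3, or about 33X.

```python
# Energy per task = power draw x run time. A 100x speedup at 3x the
# power draw cuts per-task energy by roughly 100/3, or about 33x.
speedup = 100       # run time shrinks by this factor
power_factor = 3    # power draw grows by this factor

energy_ratio = power_factor / speedup   # new energy / old energy
print(f"Energy per task falls to {energy_ratio:.2f}x the original, "
      f"about {speedup / power_factor:.0f}x less energy")
```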
He pointed out that Nvidia’s Blackwell GPU was built with energy efficiency in mind, while kicking up AI performance “in a significant way.” Integrated with liquid cooling, he said, Blackwell GPUs have become 100,000X more energy efficient for inference and 2,000X more efficient for training than Nvidia chips from a decade ago.
“It will be the platform that drives a new generation of energy-efficient data centers,” said Pette.
Source: Network World