An improvement to the way Linux handles network traffic, developed by researchers at Canada’s University of Waterloo, could make data center applications run more efficiently and save energy at the same time.
Waterloo professor Martin Karsten and Joe Damato, distinguished engineer at Fastly, developed the code, which runs to approximately 30 lines. It's based on research described in a 2023 paper, written by Karsten and grad student Peter Cai, that investigated kernel versus user-level networking and determined that a small change could not only increase application efficiency, but cut data center power usage by up to 30%.
The new code was accepted and added to version 6.13 of the Linux kernel. It adds a new NAPI configuration parameter, irq_suspend_timeout, to help balance CPU usage and network processing efficiency when using IRQ deferral and NAPI busy polling. This allows the kernel to switch automatically between two modes of delivering data to an application — polling and interrupt-driven — depending on network traffic, to maximize efficiency.
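The kernel-side knob is configured per NAPI instance over netlink. A minimal sketch, modeled on the examples in the kernel's NAPI documentation and using its `ynl` command-line tool, might look like the following; the NAPI id (345) is a hypothetical placeholder, and you would list the real ids for your interface first:

```shell
# Inspect a NAPI instance and its current settings (id 345 is hypothetical):
ynl --family netdev --do napi-get --json='{"id": 345}'

# Set irq_suspend_timeout (in nanoseconds) on that NAPI instance, allowing
# interrupts to stay suspended for up to 20 ms while the application keeps
# busy polling:
ynl --family netdev --do napi-set \
    --json='{"id": 345, "irq-suspend-timeout": 20000000}'
```

The timeout acts as a safety net: if the application stalls and stops polling for longer than this interval, the kernel re-enables interrupts so traffic is not dropped.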
In polling mode, the application requests data, processes it, and then requests more, in a continuous cycle. In interrupt-driven mode, the application sleeps, saving energy and resources, until network traffic for it arrives, then wakes up and processes it.
“If you have an old-school multi-user, multi-process server with lots of (smallish) applications running concurrently, our new mechanism won’t do anything, but also shouldn’t hurt,” Karsten explained in an email to Network World.
However, he added, “in many data center scenarios, server machines run a small number of dedicated server applications. These applications ‘dominate’ a set of cores and can usually be connected to a set of transmission queues in the NIC [network interface card]. Our mechanism helps with this type of application, if they are also dealing with lots of network traffic. This is true for pretty much all front-end servers, but also many back-end servers delivering data to front-ends.”
When network traffic is heavy, disabling interrupts and running in polling mode is most efficient and delivers the best performance. But when network traffic is light, interrupt-driven processing works best, he noted.
“An implementation using only polling would waste a lot of resources/energy during times of light traffic. An implementation using only interrupts becomes inefficient during times of heavy traffic. … So the biggest energy savings arise when comparing to a high-performance always-polling implementation during times of light traffic,” Karsten said. “Our mechanism automatically detects [the amount of network traffic] and switches between polling and interrupt-driven to get the best of both worlds.”
In the patch cover letter, Damato described the implementation of the new parameter in more detail, noting: “this delivery mode is efficient, because it avoids softIRQ execution interfering with application processing during busy periods. It can be used with blocking epoll_wait to conserve CPU cycles during idle periods. The effect of alternating between busy and idle periods is that performance (throughput and latency) is very close to full busy polling, while CPU utilization is lower and very close to interrupt mitigation.”
Added Karsten: “At the nuts and bolts level, enabling the feature requires a small tweak to applications and the setting of a system configuration variable.”
And although he can’t yet quantify the energy benefits of the technique (the 30% saving cited is best case), he said, “the biggest energy savings arise when comparing to a high-performance always-polling implementation during times of light traffic.”
Source: Network World