Why eBPF is critical and how it’s getting better

The open-source eBPF (extended Berkeley Packet Filter) technology has become one of the most critical foundational elements of networking with Linux over the last decade. Soon that same power will reach out to embrace Microsoft Windows, too.

At the eBPF Summit on Sept. 11, users and developers detailed how they are working with eBPF today and where the technology is headed. The open-source eBPF technology enables users to run code safely in the Linux kernel. It’s used to provide network packet visibility as well as numerous security capabilities. Linux first integrated eBPF in 2014, and the technology and its capabilities have grown over the past decade.

“eBPF is amazing at enabling rapid innovation for infrastructure and for tooling at the operating system level,” Thomas Graf, co-founder and CTO of Isovalent, said during his keynote.

Isovalent was acquired by Cisco earlier this year. The company is one of the leading contributors to eBPF and runs the open-source Cilium networking project, which is based on eBPF.

How Netflix uses eBPF so we can all chill

Streaming media giant Netflix was among the noteworthy users and deployments discussed at the summit. In a keynote address, Shweta Saraf, director of platform networking at Netflix, detailed how the networking world has improved over the last decade with eBPF and how the streaming media company uses the open-source technology.

Saraf noted that eBPF has shortened kernel patching and debugging cycles. It also makes performance monitoring and troubleshooting more efficient:

“In a cloud environment, we just assume that it’s a given that we have tools that allow us to debug performance on a daily basis,” she said. “But without eBPF, we would have to rely on good old tools like tcpdump and strace, and in turn, those would require a lot more system resources, they would be highly inefficient, leading us to investing a lot of dollars in monitoring the fleet at a high scale in a cloud environment.”

Netflix is both a leading contributor to and a user of eBPF. It has also built multiple networking tools of its own that use eBPF. One example is a network observability sidecar called Flow Exporter. A sidecar is a helper process or container that runs alongside an application and supports it without being part of the application itself. The Flow Exporter uses eBPF to collect and process data.

“We collect all of this data and also use it for traffic forecasting and run it through a large ML (machine learning) model, which, in turn, allows us to do interesting things like traffic shaping and dynamically addressing traffic,” she said.

The challenge of dealing with noisy neighbors is familiar to many networking professionals. Netflix is using eBPF to detect noisy neighbor problems, which can take a toll on application performance if not detected and remediated.

Netflix has also developed a tool called bpftop, which provides a real-time view of running eBPF programs and shows stats such as average execution runtime, events per second and CPU utilization.
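
To make those three statistics concrete, here is a minimal sketch of how they can be derived from two readings of a program's cumulative counters. It assumes the kernel's per-program run_time_ns and run_cnt counters have been sampled some interval apart; the ProgSample type and program_stats function are hypothetical names for illustration, and this is not bpftop's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProgSample:
    """One reading of a program's cumulative counters (hypothetical type)."""
    run_time_ns: int  # total nanoseconds spent executing the program so far
    run_cnt: int      # total number of times the program has run so far


def program_stats(prev: ProgSample, curr: ProgSample, interval_s: float) -> dict:
    """Derive per-interval stats from two cumulative samples."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta_time_ns = curr.run_time_ns - prev.run_time_ns
    delta_cnt = curr.run_cnt - prev.run_cnt
    return {
        # Average time per invocation during the interval
        "avg_runtime_ns": delta_time_ns / delta_cnt if delta_cnt else 0.0,
        # How often the program fired during the interval
        "events_per_sec": delta_cnt / interval_s,
        # Share of one CPU's time consumed by the program
        "cpu_percent": delta_time_ns / (interval_s * 1e9) * 100,
    }


if __name__ == "__main__":
    before = ProgSample(run_time_ns=1_000_000, run_cnt=100)
    after = ProgSample(run_time_ns=3_000_000, run_cnt=600)
    print(program_stats(before, after, interval_s=1.0))
```

Working from deltas rather than raw totals is what makes the view "real time": the totals only ever grow, while the differences show what the program did in the most recent interval.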

Security is another area where Netflix is making use of eBPF as part of its Dropio DDoS (distributed denial of service) mitigation tool. “We invested in building this eBPF-based module that is actually highly efficient in enforcing IP-based rules,” she said.
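
As a rough illustration of what enforcing IP-based rules means, the sketch below models the decision in user-space Python: a packet's source address is checked against a list of blocked networks and either dropped or passed. The IpBlocklist class and the example addresses are hypothetical, and this is not how Netflix's module is implemented; an in-kernel eBPF version would make the same kind of decision per packet, typically with a map lookup rather than a linear scan.

```python
import ipaddress


class IpBlocklist:
    """User-space model of an IP rule set (hypothetical, for illustration)."""

    def __init__(self, cidrs):
        # Parse each rule once so that lookups only do containment checks
        self._networks = [ipaddress.ip_network(cidr) for cidr in cidrs]

    def verdict(self, src_ip: str) -> str:
        """Return "DROP" if src_ip falls inside any blocked network, else "PASS"."""
        addr = ipaddress.ip_address(src_ip)
        if any(addr in net for net in self._networks):
            return "DROP"
        return "PASS"


if __name__ == "__main__":
    # Addresses come from the ranges reserved for documentation
    rules = IpBlocklist(["203.0.113.0/24", "198.51.100.7/32"])
    for ip in ("203.0.113.42", "198.51.100.7", "198.51.100.8"):
        print(ip, rules.verdict(ip))
```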

Where is eBPF headed? Look at Windows

Ever since it was created, eBPF has been an open-source technology that’s only available on Linux. That’s going to change soon, according to Graf.

“eBPF is coming to Windows,” Graf said.

As part of the effort to bring eBPF to Windows, he noted a key goal is making sure existing eBPF-based tools are compatible. While the eBPF bytecode language and concepts like verification and just-in-time compilation will be the same, the actual hook points where eBPF programs can attach may be slightly different depending on the operating system, Graf explained.

There are also efforts underway to make sure that eBPF is optimized to understand traffic going to GPUs (graphics processing units) and DPUs (data processing units). Both GPUs and DPUs have become increasingly important as AI workloads grow.

Another item on the eBPF roadmap is the concept of using eBPF to enable distributed intelligence. Graf explained that, to date, for any sort of intelligent behavior, analytics or machine learning use case, the typical architecture requires a lot of data to be streamed to a database. The challenge with that approach is that it eventually stops scaling, because the volume of data and the associated storage requirements grow too large.

“We’re now at the point where the amount of intelligence that can be applied is limited by your budget for data transfer and storage requirements,” Graf said.

The concept of eBPF distributed intelligence is to turn around the current approach: “Instead of bringing the data to the intelligence, we bring the intelligence to the data,” Graf said. “So think of eBPF as giving us the opportunity to not just have very deep observability, but to also essentially extract the observability we need and then apply logic to make intelligent decisions and react to that in ways that don’t require us to stream data somewhere else.”
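
The idea of bringing the intelligence to the data can be sketched in a few lines. In this hypothetical example, a node reduces its raw flow events to a small summary locally and would ship only that summary, instead of streaming every event to a central database. The summarize_flows function and the event format are assumptions made for illustration, not part of any eBPF API.

```python
from collections import Counter


def summarize_flows(events, top_n=3):
    """Reduce raw (destination, byte_count) events to the top talkers.

    Only the returned summary would need to leave the node; the raw
    events are discarded after aggregation.
    """
    bytes_by_dst = Counter()
    for dst, nbytes in events:
        bytes_by_dst[dst] += nbytes
    return bytes_by_dst.most_common(top_n)


if __name__ == "__main__":
    raw_events = [
        ("10.0.0.1", 500),
        ("10.0.0.2", 1200),
        ("10.0.0.1", 900),
        ("10.0.0.3", 100),
    ]
    # Four raw events shrink to a two-entry summary
    print(summarize_flows(raw_events, top_n=2))
```

The saving grows with traffic volume: the size of the summary depends on top_n, not on how many events the node observed.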

Source: Network World