flowtrackd: DDoS Protection with Unidirectional TCP Flow Tracking
Magic Transit is Cloudflare’s L3 DDoS Scrubbing service for protecting network infrastructure. As part of our ongoing investment in Magic Transit and our DDoS protection capabilities, we’re excited to talk about a new piece of software helping to protect Magic Transit customers: flowtrackd. flowrackd is a software-defined DDoS protection system that significantly improves our ability to automatically detect and mitigate even the most complex TCP-based DDoS attacks. If you are a Magic Transit customer, this feature will be enabled by default at no additional cost on July 29, 2020.
TCP-Based DDoS Attacks
In the first quarter of 2020, one out of every two L3/4 DDoS attacks Cloudflare mitigated was an ACK Flood, and over 66% of all L3/4 attacks were TCP based. Most types of DDoS attacks can be mitigated by finding unique characteristics that are present in all attack packets and using that to distinguish ‘good’ packets from the ‘bad’ ones. This is called “stateless” mitigation, because any packet that has these unique characteristics can simply be dropped without remembering any information (or “state”) about the other packets that came before it. However, when attack packets have no unique characteristics, then “stateful” mitigation is required, because whether a certain packet is good or bad depends on the other packets that have come before it.
The most sophisticated types of TCP flood require stateful mitigation, where every TCP connection must be tracked in order to know whether any particular TCP packet is part of an active connection. That kind of mitigation is called “flow tracking”, and it is typically implemented in Linux by the iptables conntrack module. However, DDoS protection with conntrack is not as simple as flipping the iptable switch, especially at the scale and complexity that Cloudflare operates in. If you’re interested to learn more, in this blog we talk about the technical challenges of implementing iptables conntrack.
Complex TCP DDoS attacks pose a threat as they can be harder to detect and mitigate. They therefore have the potential to cause service degradation, outages and increased false positives with inaccurate mitigation rules. So how does Cloudflare block patternless DDoS attacks without affecting legitimate traffic?
Bidirectional TCP Flow Tracking
Using Cloudflare’s traditional products, HTTP applications can be protected by the WAF service, and TCP/UDP applications can be protected by Spectrum. These services are “reverse proxies”, meaning that traffic passes through Cloudflare in both directions. In this bidirectional topology, we see the entire TCP flow (i.e., segments sent by both the client and the server) and can therefore track the state of the underlying TCP connection. This way, we know if a TCP packet belongs to an existing flow or if it is an “out of state” TCP packet. Out of state TCP packets look just like regular TCP packets, but they don’t belong to any real connection between a client and a server. These packets are most likely part of an attack and are therefore dropped.
Reverse Proxy: What Cloudflare Sees
While not trivial, tracking TCP flows can be done when we serve as a proxy between the client and server, allowing us to absorb and mitigate out of state TCP floods. However it becomes much more challenging when we only see half of the connection: the ingress flow. This visibility into ingress but not egress flows is the default deployment method for Cloudflare’s Magic Transit service, so we had our work cut out for us in identifying out of state packets.
The Challenge With Unidirectional TCP Flows
With Magic Transit, Cloudflare receives inbound internet traffic on behalf of the customer, scrubs DDoS attacks, and routes the clean traffic to the customer’s data center over a tunnel. The data center then responds directly to the eyeball client using a technique known as Direct Server Return (DSR).
Magic Transit: Asymmetric L3 Routing
Using DSR, when a TCP handshake is initiated by an eyeball client, it sends a SYN packet that gets routed via Cloudflare to the origin data center. The origin then responds with a SYN-ACK directly to the client, bypassing Cloudflare. Finally, the client responds with an ACK that once again routes to the origin via Cloudflare and the connection is then considered established.
L3 Routing: What Cloudflare Sees
In a unidirectional flow we don’t see the SYN+ACK sent from the origin to the eyeball client, and therefore can’t utilize our existing flow tracking capabilities to identify out of state packets.
Unidirectional TCP Flow Tracking
To overcome the challenges of unidirectional flows, we recently completed the development and rollout of a new system, codenamed flowtrackd (“flow tracking daemon”). flowtrackd is a state machine that hooks into the network interface. It tracks unidirectional TCP flows using only the ingress traffic that routes through Cloudflare to determine the state of the TCP connection. flowtrackd is then able to determine if a packet is part of a new connection, an open one, a connection that is closing, one that is closed, or if it’s an out of state packet. Once a high volume of out-of-state packets is detected, flowtrackd will either challenge (force RST) or drop the packets.
Snapshot from what flowtrackd sees
The state machine that determines the state of the flows was developed in-house and complements Gatebot and dosd. Together Gatebot, dosd, and flowtrackd provide a comprehensive multi layer DDoS protection.
Releasing flowtrackd to the Wild
And it works! Less than a day after releasing flowtrackd to an early access customer, flowtrackd automatically detected and mitigated an ACK flood that peaked at 6 million packets per second. No downtime, service disruption, or false positives were reported.
flowtrackd Mitigates 6M pps Flood
Cloudflare’s DDoS Protection – Delivered From Every Data Center
As opposed to legacy scrubbing center providers with limited network infrastructures, Cloudflare provides DDoS Protection from every one of our data centers in over 200 locations around the world. We write our own software-defined DDoS protection systems. Notice I say systems, because as opposed to vendors that use a dedicated third party appliance, we’re able to write and spin up whatever software we need, deploy it in the optimal location in our tech stack and are therefore not dependent on other vendors or be limited to the capabilities of one appliance.
flowtrackd joins the Cloudflare DDoS protection family which includes our veteran Gatebot and the younger and energetic dosd. flowtrackd will be available from every one of our data centers, with a total mitigation capacity of over 37 Tbps, protecting our Magic Transit customers against the most complex TCP DDoS attacks.
New to Magic Transit? Replace your legacy provider with Magic Transit and pay nothing until your current contract expires. Offer expires September 1, 2020. Click here for details.