Back in 2012, we introduced Page Rules, a pioneering feature that gave Cloudflare users unprecedented control over how their web traffic was managed. At the time, this was a significant leap forward, enabling users to define patterns for specific URLs and adjust Cloudflare features on a page-by-page basis. The ability to apply such precise configurations through a simple, user-friendly interface was a major advancement, establishing Page Rules as a cornerstone of our platform.
Page Rules allowed users to implement a variety of actions, including redirects, which automatically send visitors from one URL to another. Redirects are crucial for maintaining a seamless user experience on the Internet, whether it’s guiding users from outdated links to new content or managing traffic during site migrations.
As the Internet has evolved, so too have the needs of our users. The demand for greater flexibility, higher performance, and more advanced capabilities led to the development of the Ruleset Engine, a powerful framework designed to handle complex rule evaluations with unmatched speed and precision.
In September 2022, we announced and released Single Redirects as a modern replacement for the URL Forwarding feature of Page Rules. Built on top of the Ruleset Engine, this new product offered a powerful syntax and enhanced performance.
Despite the enhancements, one of the most consistent pieces of feedback from our users was the need for wildcard matching and expansion, also known as globbing. This feature is essential for creating dynamic and flexible URL patterns, allowing users to manage a broader range of scenarios with ease.
Today we are excited to announce that wildcard support is now available across our Ruleset Engine-based products, including Cache Rules, Compression Rules, Configuration Rules, Custom Errors, Origin Rules, Redirect Rules, Snippets, Transform Rules, Web Application Firewall (WAF), Waiting Room, and more.
Understanding wildcards
Wildcard pattern matching allows users to employ an asterisk (`*`) in a string to match certain patterns. For example, a single pattern like `https://example.com/*/t*st` can cover multiple URLs such as `https://example.com/en/test`, `https://example.com/images/toast`, and `https://example.com/blog/trust`.
Once a segment is captured, it can be used in another expression by referencing the matched wildcard with the `${<X>}` syntax, where `<X>` indicates the index of a matched pattern. This is particularly useful in URL forwarding. For instance, the URL pattern `https://example.com/*/t*st` can redirect to `https://${1}.example.com/t${2}st`, allowing dynamic and flexible URL redirection. This setup ensures that `https://example.com/uk/test` is forwarded to `https://uk.example.com/test`, `https://example.com/images/toast` to `https://images.example.com/toast`, and so on.
Challenges with Single Redirects
In Page Rules, redirecting from an old URI path to a new one looked like this:
Source URL: `https://example.com/old-path/*`
Target URL: `https://example.com/new-path/$1`
In comparison, replicating this behaviour in Single Redirects without wildcards required a more complex approach:
Filter: `(http.host eq "example.com" and starts_with(http.request.uri.path, "/old-path/"))`
Expression: `concat("/new-path/", substring(http.request.uri.path, 10))` (where 10 is the length of `/old-path/`)
This complexity created unnecessary overhead and difficulty, especially for users without access to regular expressions (regex) or the technical expertise to come up with expressions that use nested functions.
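The manual approach above amounts to a prefix check followed by string slicing. As a rough illustration only, the same arithmetic can be written in Rust; the function name `rewrite_path` is hypothetical and is not part of the Ruleset Engine:

```rust
/// Hypothetical helper mirroring the filter and expression above:
/// keep everything after "/old-path/" and prepend "/new-path/".
fn rewrite_path(path: &str) -> Option<String> {
    const OLD_PREFIX: &str = "/old-path/"; // 10 bytes long, hence `substring(..., 10)`
    const NEW_PREFIX: &str = "/new-path/";

    // Equivalent of `starts_with(http.request.uri.path, "/old-path/")`.
    if !path.starts_with(OLD_PREFIX) {
        return None;
    }
    // Equivalent of `substring(http.request.uri.path, 10)`.
    let rest = &path[OLD_PREFIX.len()..];
    // Equivalent of `concat("/new-path/", ...)`.
    Some(format!("{}{}", NEW_PREFIX, rest))
}
```

The hard-coded offset is the fragile part: if the old prefix changes, the number 10 has to be recomputed by hand.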
Wildcard support in Ruleset Engine
With the introduction of wildcard support across our Ruleset Engine-based products, users can now take advantage of the power and flexibility of the Ruleset Engine through simpler and more intuitive configurations. This enhancement preserves high performance while making it easier to create dynamic and flexible patterns, both for URLs and for other fields.
What’s new?
1) Operators “wildcard” and “strict wildcard” in Ruleset Engine:
“wildcard” (case insensitive): Matches patterns regardless of case (e.g., “test” and “TesT” are treated the same, similar to Page Rules).
“strict wildcard” (case sensitive): Matches patterns exactly, respecting case differences (e.g., “test” won’t match “TesT”).
Both operators can be applied to any string field available in the Ruleset Engine, including full URI, host, headers, cookies, user-agent, country, and more.
As an example, consider a Web Application Firewall (WAF) rule that applies the “wildcard” operator to the User Agent field. The rule matches any incoming request whose User Agent string starts with “Mozilla/” and includes specific elements like “Macintosh; Intel Mac OS ”, “Gecko/”, and “Firefox/”. Importantly, the wildcard operator is case insensitive, so it captures variations like “mozilla” and “Mozilla” without requiring exact matches.
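The difference between the two operators can be sketched with a small matcher that takes a case-sensitivity flag. This is a simplified illustration, not the Ruleset Engine's implementation: the function names are hypothetical, the recursion is chosen for brevity rather than speed, and folding only ASCII case is an assumption.

```rust
/// Returns true if `input` matches `pattern` in full, where `*` matches
/// zero or more bytes. Recursive for readability, not for performance.
fn matches(pattern: &[u8], input: &[u8], case_sensitive: bool) -> bool {
    match pattern.split_first() {
        // Pattern exhausted: it matches only if the input is exhausted too.
        None => input.is_empty(),
        // `*`: try every possible length for the segment it covers.
        Some((&b'*', rest)) => {
            (0..=input.len()).any(|i| matches(rest, &input[i..], case_sensitive))
        }
        // Literal byte: compare with or without ASCII case folding.
        Some((&p, rest)) => match input.split_first() {
            Some((&c, tail)) => {
                let same = if case_sensitive {
                    p == c
                } else {
                    p.eq_ignore_ascii_case(&c)
                };
                same && matches(rest, tail, case_sensitive)
            }
            None => false,
        },
    }
}

/// Sketch of the case-insensitive "wildcard" operator.
fn wildcard(pattern: &str, input: &str) -> bool {
    matches(pattern.as_bytes(), input.as_bytes(), false)
}

/// Sketch of the case-sensitive "strict wildcard" operator.
fn strict_wildcard(pattern: &str, input: &str) -> bool {
    matches(pattern.as_bytes(), input.as_bytes(), true)
}
```

With these definitions, `wildcard("t*st", "TesT")` returns `true`, while `strict_wildcard("t*st", "TesT")` returns `false`.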
2) Function `wildcard_replace()` in Single Redirects:
In Single Redirects, the `wildcard_replace()` function allows you to use matched segments in redirect URL targets.
Consider the URL pattern `https://example.com/*/t*st` mentioned earlier. Using `wildcard_replace()`, you can now set the target URL to `https://${1}.example.com/t${2}st` and dynamically redirect URLs like `https://example.com/uk/test` to `https://uk.example.com/test` and `https://example.com/images/toast` to `https://images.example.com/toast`.
3) Simplified UI in Single Redirects:
We understand that not everyone wants to use advanced Ruleset Engine functions, especially for simple URL patterns. That’s why we’ve introduced an easy and intuitive UI for Single Redirects called “wildcard pattern”. This new interface, available under the Rules > Redirect Rules tab of the zone dashboard, lets you specify request and target URL wildcard patterns in seconds without needing to delve into complex functions, much like Page Rules.
How we built it
The Ruleset Engine powering Cloudflare Rules products is written in Rust. When adding wildcard support, we first explored existing Rust crates for wildcard matching.
We considered using the popular `regex` crate, known for its robustness. However, it requires converting wildcard patterns into regular expressions (e.g., `*` to `.*`, and `?` to `.`) and escaping other characters that are special in regex patterns, which adds complexity.
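To give a sense of what that conversion involves, here is a minimal sketch using only the standard library. The function name `wildcard_to_regex` is hypothetical, and the set of escaped characters is an assumption covering the common regex metacharacters; the resulting string would still have to be compiled by a regex engine.

```rust
/// Converts a wildcard pattern into an anchored regular expression string.
/// `*` becomes `.*`, `?` becomes `.`, and regex metacharacters are escaped.
fn wildcard_to_regex(pattern: &str) -> String {
    let mut out = String::with_capacity(pattern.len() + 2);
    out.push('^'); // anchor at the start so the whole input must match
    for ch in pattern.chars() {
        match ch {
            '*' => out.push_str(".*"),
            '?' => out.push('.'),
            // Characters with a special meaning in regex syntax.
            '\\' | '.' | '+' | '(' | ')' | '[' | ']' | '{' | '}' | '|' | '^' | '$' => {
                out.push('\\');
                out.push(ch);
            }
            _ => out.push(ch),
        }
    }
    out.push('$'); // anchor at the end
    out
}
```

For the pattern `https://example.com/*/t*st`, this produces `^https://example\.com/.*/t.*st$`. Note how the literal dot in the hostname has to be escaped, which is exactly the kind of detail that makes the conversion error-prone.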
We also looked at the `wildmatch` crate, which is designed specifically for wildcard matching and has a couple of advantages over `regex`. The most obvious one is that there is no need to convert wildcard patterns to regular expressions. More importantly, wildmatch can handle complex patterns efficiently: wildcard matching takes quadratic time – in the worst case the time is proportional to the length of the pattern multiplied by the length of the input string. To be more specific, the time complexity is O(p + ℓ + s ⋅ ℓ), where p is the length of the wildcard pattern, ℓ the length of the input string, and s the number of asterisk metacharacters in the pattern. (If you are not familiar with big O notation, it is a way to express how an algorithm consumes a resource, in this case time, as the input size changes.) In the Ruleset Engine, we limit the number of asterisk metacharacters in the pattern to a maximum of 8. This ensures we will have good performance and limits the impact of a bad actor trying to consume too much CPU time by targeting extremely complicated patterns and input strings.
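The following sketch shows one generic way to match wildcards iteratively, together with a check for the limit of 8 asterisks mentioned above. It is not the code of `wildmatch` or of Cloudflare's crate: the function names are hypothetical, escape sequences are ignored, and the matcher simply remembers the most recent `*` and retries from there on a mismatch.

```rust
/// Maximum number of `*` metacharacters allowed in a pattern (from the text above).
const MAX_ASTERISKS: usize = 8;

/// Rejects patterns that contain more asterisks than the limit allows.
fn validate_pattern(pattern: &[u8]) -> Result<(), String> {
    let stars = pattern.iter().filter(|&&b| b == b'*').count();
    if stars > MAX_ASTERISKS {
        Err(format!(
            "pattern has {} asterisks; the limit is {}",
            stars, MAX_ASTERISKS
        ))
    } else {
        Ok(())
    }
}

/// Returns true if `input` matches `pattern` in full (`*` = zero or more bytes).
fn wildcard_match(pattern: &[u8], input: &[u8]) -> bool {
    let (mut p, mut i) = (0usize, 0usize);
    // (pattern index after the last `*`, input index that `*` currently extends to)
    let mut backtrack: Option<(usize, usize)> = None;

    while i < input.len() {
        if p < pattern.len() && pattern[p] == b'*' {
            // Let `*` match nothing for now and remember where to retry.
            backtrack = Some((p + 1, i));
            p += 1;
        } else if p < pattern.len() && pattern[p] == input[i] {
            // Literal byte matches: advance both cursors.
            p += 1;
            i += 1;
        } else if let Some((bp, bi)) = backtrack {
            // Mismatch: let the last `*` absorb one more byte and retry.
            p = bp;
            i = bi + 1;
            backtrack = Some((bp, bi + 1));
        } else {
            return false;
        }
    }
    // Input consumed: only trailing asterisks may remain in the pattern.
    pattern[p..].iter().all(|&b| b == b'*')
}
```

Because a mismatch only ever rewinds to the most recent asterisk, the loop never explores more than a bounded number of restarts per input position, which is what keeps the worst case from becoming exponential.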
Unfortunately, `wildmatch` did not meet all our requirements. Ruleset Engine uses byte-oriented matching, and `wildmatch` works only on UTF-8 strings. We also have to support escape sequences: for example, you should be able to represent a literal `*` in the pattern with `\*`.
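These two requirements can be illustrated with a small pattern parser that works on raw bytes and resolves escapes before matching. This is a sketch under assumptions rather than the actual crate's design: the `Token` type and `parse_pattern` function are hypothetical, and treating `\\` as a literal backslash and rejecting any other escape are choices made for this example.

```rust
/// One element of a parsed wildcard pattern.
#[derive(Debug, PartialEq)]
enum Token {
    /// A byte that must match exactly (may be any value, not just valid UTF-8).
    Literal(u8),
    /// An unescaped `*`, matching zero or more bytes.
    Star,
}

/// Parses a byte pattern, turning `\*` into a literal asterisk.
fn parse_pattern(pattern: &[u8]) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::with_capacity(pattern.len());
    let mut bytes = pattern.iter().copied();
    while let Some(b) = bytes.next() {
        match b {
            b'*' => tokens.push(Token::Star),
            b'\\' => match bytes.next() {
                // `\*` and `\\` stand for the literal characters (the latter is assumed).
                Some(b'*') => tokens.push(Token::Literal(b'*')),
                Some(b'\\') => tokens.push(Token::Literal(b'\\')),
                Some(other) => {
                    return Err(format!("unsupported escape: \\{}", other as char));
                }
                None => return Err("pattern ends with a lone backslash".to_string()),
            },
            other => tokens.push(Token::Literal(other)),
        }
    }
    Ok(tokens)
}
```

Operating on `&[u8]` rather than `&str` means a pattern such as `b"\xff*"` is accepted even though it is not valid UTF-8.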
Last but not least, to implement the `wildcard_replace()` function we needed not only to be able to match, but also to be able to replace parts of strings with captured segments. This is necessary to dynamically create HTTP redirects based on the source URL. For example, to redirect a request from `https://example.com/*/page/*` to `https://example.com/products/${1}?page=${2}`, you should be able to define the target URL using an expression like this:
wildcard_replace(
    http.request.full_uri,
    "https://example.com/*/page/*",
    "https://example.com/products/${1}?page=${2}"
)
This means that in order to implement this function in the Ruleset Engine, we also need our wildcard matching implementation to capture the input substrings that match the wildcard’s metacharacters.
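To make the capture-and-replace idea concrete, here is a minimal sketch in Rust. It is an illustration only and not the implementation in the Ruleset Engine: the names `captures` and `replace_wildcards` are hypothetical, the matcher is a simple recursive one that prefers the shortest segment for each `*` (an assumption), and returning `None` when the input does not match is a choice made for this example.

```rust
/// Matches `input` against `pattern` in full and returns the segment
/// captured by each `*`, in order. Recursive for clarity, not speed.
fn captures<'a>(pattern: &[u8], input: &'a [u8]) -> Option<Vec<&'a [u8]>> {
    match pattern.split_first() {
        None => {
            if input.is_empty() {
                Some(Vec::new())
            } else {
                None
            }
        }
        Some((&b'*', rest)) => {
            // Try the shortest possible segment first.
            for len in 0..=input.len() {
                if let Some(mut later) = captures(rest, &input[len..]) {
                    let mut caps = vec![&input[..len]];
                    caps.append(&mut later);
                    return Some(caps);
                }
            }
            None
        }
        Some((&p, rest)) => match input.split_first() {
            Some((&c, tail)) if c == p => captures(rest, tail),
            _ => None,
        },
    }
}

/// Builds the target string by substituting `${1}`, `${2}`, ... with the
/// captured segments. Returns `None` if `input` does not match `pattern`.
fn replace_wildcards(input: &str, pattern: &str, replacement: &str) -> Option<String> {
    let caps = captures(pattern.as_bytes(), input.as_bytes())?;
    let rep = replacement.as_bytes();
    let mut out: Vec<u8> = Vec::with_capacity(rep.len());
    let mut i = 0;
    while i < rep.len() {
        // Look for a `${<index>}` reference starting at position `i`.
        if rep[i] == b'$' && rep.get(i + 1) == Some(&b'{') {
            if let Some(close) = rep[i + 2..].iter().position(|&b| b == b'}') {
                let index = std::str::from_utf8(&rep[i + 2..i + 2 + close])
                    .ok()
                    .and_then(|s| s.parse::<usize>().ok());
                if let Some(n) = index {
                    if n >= 1 && n <= caps.len() {
                        out.extend_from_slice(caps[n - 1]);
                        i += close + 3; // skip past `${`, the digits, and `}`
                        continue;
                    }
                }
            }
        }
        // Anything else is copied through unchanged.
        out.push(rep[i]);
        i += 1;
    }
    String::from_utf8(out).ok()
}
```

Calling `replace_wildcards("https://example.com/shoes/page/2", "https://example.com/*/page/*", "https://example.com/products/${1}?page=${2}")` yields `https://example.com/products/shoes?page=2`. The substitution is done in a single pass so that captured text containing something that looks like `${2}` is not expanded a second time.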
Given these requirements, we decided to build our own wildcard matching crate. The implementation is based on Kurt’s 2016 iterative algorithm, with optimizations from Krauss’ 2014 algorithm. (You can find more information about the algorithm here). Our implementation supports byte-oriented matching, escape sequences, and capturing matched segments for further processing.
Cloudflare’s `wildcard` crate is now available and is open-source. You can find the source repository here. Contributions are welcome!
FAQs and resources
For more details on using wildcards in Rules products, please refer to our updated Ruleset Engine documentation.
We value your feedback and invite you to share your thoughts in our community forums. Your input directly influences our product and design decisions, helping us make Rules products even better.
Additionally, check out our `wildcard` crate implementation and contribute to its development.
Conclusion
The new wildcard functionality in Rules is available to all plans and is completely free. This feature is rolling out immediately, and no beta access registration is required.
We are thrilled to offer this much-requested feature and look forward to seeing how you leverage wildcards in your Rules configurations. Try it now and experience the enhanced flexibility and performance. Your feedback is invaluable to us, so please let us know in the community how this new feature works for you!
Source: Cloudflare