How do you know the code your web browser downloads when visiting a website is the code the website intended you to run? In contrast to a mobile app downloaded from a trusted app store, the web doesn’t provide the same degree of assurance that the code hasn’t been tampered with. Today, we’re excited to be partnering with WhatsApp to provide a system that assures users that the code run when they visit WhatsApp on the web is the code that WhatsApp intended.
With WhatsApp usage in the browser growing, and the increasing number of at-risk users — including journalists, activists, and human rights defenders — WhatsApp wanted to take steps to provide assurances to browser-based users. They approached us to help dramatically raise the bar for third-parties looking to compromise or otherwise tamper with the code responsible for end-to-end encryption of messages between WhatsApp users.
So how will this work? Cloudflare holds a hash of the code that WhatsApp users should be running. When users run WhatsApp in their browser, the WhatsApp Code Verify extension compares a hash of that code that is executing in their browser with the hash that Cloudflare has — enabling them to easily see whether the code that is executing is the code that should be.
The idea itself — comparing hashes to detect tampering or even corrupted files — isn’t new, but automating it, deploying it at scale, and making sure it “just works” for WhatsApp users is. Given the reach of WhatsApp and the implicit trust put into Cloudflare, we want to provide more detail on how this system actually works from a technical perspective.
Before we dive in, there’s one important thing to explicitly note: Cloudflare is providing a trusted audit endpoint to support Code Verify. Messages, chats or other traffic between WhatsApp users are never sent to Cloudflare; those stay private and end-to-end encrypted. Messages or media do not traverse Cloudflare’s network as part of this system, an important property from Cloudflare’s perspective in our role as a trusted third party.
Making verification easier
Hark back to 2003: Fedora, a popular Linux distribution based on Red Hat, has just been launched. You’re keen to download it, but want to make sure you have the “real” Fedora, and that the download isn’t a “fake” version that siphons off your passwords or logs your keystrokes. You head to the download page, kick off the download, and see an MD5 hash (considered secure at the time) next to the download. After the download is complete, you run `md5 fedora-download.iso` and compare the hash output to the hash on the page. They match, life is good, and you proceed to installing Fedora onto your machine.
But hold on a second: if the same website providing the download is also providing the hash, couldn’t a malicious actor replace both the download and the hash with their own values? The
md5 check we ran above would still pass, but there’s no guarantee that we have the “real” (untampered) version of the software we intended to download.
There are other approaches that attempt to improve upon this — providing signed signatures that users can verify were signed with “well known” public keys hosted elsewhere. Hosting those signatures (or “hashes”) with a trusted third party dramatically raises the bar when it comes to tampering, but now we require the user to know who to trust, and require them to learn tools like GnuPG. That doesn’t help us trust and verify software at the scale of the modern Internet.
This is where the Code Verify extension and Cloudflare come in. The Code Verify extension, published by Meta Open Source, automates this: locally computing the cryptographic hash of the libraries used by WhatsApp Web and comparing that hash to one from a trusted third-party source (Cloudflare, in this case).
We’ve illustrated this to make how it works a little clearer, showing how each of the three parties — the user, WhatsApp and Cloudflare — interact with each other.
Broken down, there are four major steps to verifying the code hasn’t been tampered with:
If the hashes match, as they should under almost any circumstance, the code is “verified” from the perspective of the extension. If the hashes don’t match, it indicates that the code running on the user’s browser is different from the code WhatsApp intended to run on all its user’s browsers.
Security needs to be convenient
It’s this process — and the fact that is automated on behalf of the user — that helps provide transparency in a scalable way. If users had to manually fetch, compute and compare the hashes themselves, detecting tampering would only be for the small fraction of technical users. For a service as large as WhatsApp, that wouldn’t have been a particularly accessible or user-friendly approach.
This approach also has parallels to a number of technologies in use today. One of them is Subresource Integrity in web browsers: when you fetch a third-party asset (such as a script or stylesheet), the browser validates that the returned asset matches the hash described. If it doesn’t, it refuses to load that asset, preventing potentially compromised scripts from siphoning off user data. Another is Certificate Transparency and the related Binary Transparency projects. Both of these provide publicly auditable transparency for critical assets, including WebPKI certificates and other binary blobs. The system described in this post doesn’t scale to arbitrary assets – yet – but we are exploring ways in which we could extend this offering for something more general and usable like Binary Transparency.
Our collaboration with the team at WhatsApp is just the beginning of the work we’re doing to help improve privacy and security on the web. We’re aiming to help other organizations verify the code delivered to users is the code they’re meant to be running. Protecting Internet users at scale and enabling privacy are core tenets of what we do at Cloudflare, and we look forward to continuing this work throughout 2022.