An Update on CDNJS
When you loaded this blog, a file was delivered to your browser called jquery-3.2.1.min.js. jQuery is a library which makes it easier to build websites, and was at one point included on as many as 74.1% of all websites. A full eighteen million sites include jQuery and other libraries using one of the most popular tools on Earth: CDNJS. Beginning about a month ago Cloudflare began to take a more active role in the operation of CDNJS. This post is here to tell you more about CDNJS’ history and explain why we are helping to manage CDNJS.
What CDNJS Does
Virtually every site is composed of not just the code written by its developers, but also dozens or hundreds of libraries. These libraries make it possible for websites to extend what a web browser can do on its own. For example, libraries can allow a site to include powerful data visualizations, respond to user input, or even get more performant.
These libraries created wondrous and magical new capabilities for web browsers, but they can also cause the size of a site to explode. Particularly a decade ago, connections were not always fast enough to permit the use of many libraries while maintaining performance. But if so many websites are all including the same libraries, why was it necessary for each of them to load their own copy?
If we all load jQuery from the same place the browser can do a much better job of not actually needing to download it for every site. When the user visits the first jQuery-powered site it will have to be downloaded, but it will already be cached on the user’s computer for any subsequent jQuery-powered site they might visit.
The first visit might take time to load:
But any future visit to any website pointing to this common URL would already be cached:
<!-- Loaded only on my site, will need to be downloaded by every user --> <script src="./jquery.js"></script> <!-- BETTER, loaded from a common location across many sites --> <script src="https://cdnjs.cloudflare.com/jquery.js"></script>
Beyond the performance advantage, including files this way also made it very easy for users to experiment and create. When using a web browser as a creation tool users often didn’t have elaborate build systems (this was also before npm), so being able to include a simple script tag was a boon.
The project was quickly wildly successful, far beyond their expectations. Their CDN bill quickly began to skyrocket (it would now be over a million dollars a year on AWS). Under the threat of having to shut down the service, Cloudflare was approached by the CDNJS team to see if we could help. We agreed to support their efforts and created cdnjs.cloudflare.com which serves the CDNJS project free of charge.
Spikes can happen when a very large or popular site installs CDNJS, or when a misbehaving web crawler discovers a CDNJS link.
The future value of CDNJS is now in doubt, as web browsers are beginning to use a separate cache for every website you visit. It is currently used on such a wide swath of the web, however, it is unlikely it will be disappearing any time soon.
How CDNJS Works
CDNJS starts with a Github repo. That project contains every file served by CDNJS, at every version which it has ever offered. That’s 182 GB without the commit history, over five million files, and over 1.5 billion lines of code.
To make changes to CDNJS a commit has to be made. For new projects being added to CDNJS, or when projects change significantly, these commits are made by humans, and get reviewed by other humans. When projects just release new versions there is a bot made by Peter and maintained by Sven which sucks up changes from npm and automatically creates commits.
Within Cloudflare’s infrastructure there is a set of machines which are responsible for pulling the latest version of the repo periodically. Those machines then become the origin for cdnjs.cloudflare.com, with Cloudflare’s Global Load Balancer automatically handling failures. Cloudflare’s cache automatically stores copies of many of the projects making it possible for us to deliver them quickly from all 195 of our data centers.
The Internet on a Shoestring Budget
The CDNJS project has always been administered independently of Cloudflare. In addition to the founders, the project has additionally been maintained by exceptionally hard-working caretakers like Peter and Matt Cowley. Maintaining a single repo of nearly every frontend project on Earth is no small task, and it has required a substantial amount of both manual work and bot development.
Unfortunately approximately thirty days ago one of those bots stopped working, preventing updated projects from appearing in CDNJS. The bot’s open-source maintainer was not able to invest the time necessary to keep the bot running. After several weeks we were asked by the community and the CDNJS founders to take over maintenance of the CDNJS repo itself. This means the Cloudflare engineering team is taking responsibility for keeping the contents of github.com/cdnjs/cdnjs up to date, in addition to ensuring it is correctly served on cdnjs.cloudflare.com.
We agreed to do this because we were, frankly, scared. Like so many open-source projects CDNJS was a critical part of our world, but wasn’t getting the attention it needed to survive. The Internet relies on CDNJS as much as on any other single project, losing it or allowing it to be commandeered would be catastrophic to millions of websites and their visitors. If it began to fail, some sites would adapt and update, others would be broken forever.
CDNJS has always been, and remains, a project for and by the community. We are invested in making all decisions in a transparent and inclusive manner. If you are interested in contributing to CDNJS or in the topics we’re currently discussing please visit the CDNJS Github Issues page.
A Plan for the Future
One example of an area where we could use your help is in charting a path towards a CDNJS which requires less manual moderation. Nothing can replace the intelligence and creativity of a human (yet), but for a task like managing what resources go into a CDN, it is error prone and time consuming. At present a human has to review every new project to be included, and often has to take additional steps to include new versions of a project.
As a part of our analysis of the project we examined a snapshot of the still-open PRs made against CDNJS for several months:
The vast majority of these PRs were changes which ultimately passed the automated review but nevertheless couldn’t be merged without manual review.
There is consensus that we should move to a model which does not require human involvement in most cases. We would love your input and collaboration on the best way for that to be solved. If this is something you are passionate about, please contribute here.
Our plan is to support the CDNJS project in whichever ways it requires for as long as the Internet relies upon it. We invite you to use CDNJS in your next project with the full assurance that it is backed by the same network and team who protect and accelerate over twenty million of your favorite websites across the Internet. We are also planning more posts diving further into the CDNJS data, subscribe to this blog if you would like to be notified upon their release.