Libraries, archives, and historical societies work hard to make unique collections available online. These collections are valuable public resources, but they can also attract malicious traffic and be targeted by:
Automated bots attempting to harvest large amounts of content
Denial-of-service attacks (explained below) that overwhelm servers
Malicious attempts to exploit site vulnerabilities
Without appropriate protections, these activities can slow down the site, increase hosting costs, and, in severe cases, make the collection temporarily unavailable. They can also lead to unauthorized copying and misuse of copyrighted material.
For many of the collections we host and support, Cloudflare is an important part of our security and performance strategy. It is one of several tools we use to help safeguard our clients’ content and ensure secure and reliable access.
Cloudflare is one of the world’s most widely used Content Delivery Networks and web protection platforms, trusted by millions of websites (approximately one in five websites worldwide), including governments, financial institutions, universities, libraries, and major global organizations.
It acts like a protective layer between sites and their visitors, helping keep content secure while ensuring collections remain responsive and reliably accessible.
Below we’ve outlined what we consider to be the key benefits of using Cloudflare:
Cloudflare continuously analyzes traffic across its global network to identify suspicious behavior and known malicious sources. It uses this threat intelligence to automatically filter known harmful traffic before it reaches your collection. This includes blocking requests from IP addresses, networks, and automated tools associated with cyberattacks, scraping, and other abusive activity.
Our team uses these capabilities, where Cloudflare is deployed, to monitor traffic patterns, investigate unusual activity, and configure custom security rules. When we identify new scraping techniques or malicious sources, we can quickly update these rules.
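To make the idea of filtering known harmful sources more concrete, here is a minimal Python sketch. It is an illustration only, not Cloudflare's implementation: the `BLOCKED_NETWORKS` list and the `is_blocked` function are hypothetical names, and the addresses are reserved documentation ranges rather than real threat data.

```python
import ipaddress

# Hypothetical blocklist for illustration. These are reserved documentation
# ranges (RFC 5737), not real malicious sources.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),    # an entire network
    ipaddress.ip_network("198.51.100.42/32"),  # a single address
]


def is_blocked(client_ip: str) -> bool:
    """Return True if the client address falls inside any blocked network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("203.0.113.7", "198.51.100.42", "192.0.2.1"):
        verdict = "blocked" if is_blocked(ip) else "passed to the collection"
        print(f"{ip}: {verdict}")
```

In practice the list of harmful sources is far larger and changes constantly, which is why a continuously updated source of threat intelligence matters more than any single rule.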
Where enabled, Cloudflare stores copies (or caches) of frequently accessed content on servers around the world through its Content Delivery Network. This helps collections load faster for users wherever they are located. By serving much of this content directly, Cloudflare also reduces the load on the collection’s servers, helping lower bandwidth usage and associated hosting costs.
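The effect of caching on server load can be sketched in a few lines of Python. This is a simplified model under our own assumptions, not a description of how Cloudflare's network is built: the `EdgeCache` class, the `fetch_from_origin` callback, and the five-minute lifetime are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Keeps copies of responses for a limited time (a time-to-live, or TTL)."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps a path to (expiry time, cached body).
        self._store: Dict[str, Tuple[float, bytes]] = {}
        self.origin_requests = 0  # how often the collection's server was contacted

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh copy: served without touching the origin
        # Missing or expired: fetch once from the origin and remember the result.
        self.origin_requests += 1
        body = self._fetch(path)
        self._store[path] = (now + self._ttl, body)
        return body


if __name__ == "__main__":
    cache = EdgeCache(lambda path: f"contents of {path}".encode())
    for _ in range(1000):
        cache.get("/images/map-1850.jpg")
    print(cache.origin_requests)  # the origin was contacted once for 1000 visits
```

The point of the sketch is the counter: repeat visits to popular items are answered from the stored copy, so the collection's own server does far less work.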
Cloudflare helps protect against common attacks, including Distributed Denial of Service (DDoS) attacks, which attempt to overwhelm a site with excessive traffic; SQL injection attempts designed to manipulate a website’s database; brute-force attacks intended to crack passwords; and attempts to exploit other widely recognized vulnerabilities, including those listed in the OWASP Top 10.
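One general technique for blunting both excessive traffic and repeated password guessing is rate limiting: capping how many requests a single client may make in a given period. The Python sketch below shows a simple sliding-window limiter. It is our own illustration of the general idea, not Cloudflare's method, and the `SlidingWindowLimiter` class and its limits are hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allows at most `max_requests` per client within any `window_seconds` span."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Timestamps of recently accepted requests, per client.
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: reject (or slow down, or challenge)
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=3, window_seconds=10.0)
    # Five login attempts from the same (documentation-range) address in 4 seconds.
    for second in range(5):
        print(second, limiter.allow("203.0.113.7", now=float(second)))
```

A limiter like this leaves ordinary visitors unaffected while making rapid automated attempts much less effective.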
Some bots are helpful, such as search engines that index content for discovery. Others are designed to copy large amounts of material, overload servers, or misuse website resources. Cloudflare uses machine learning and behavioral analysis to distinguish legitimate users from malicious bots. It also offers innovative tools to deter scrapers, including the ability to serve realistic AI-generated content to suspected bots, trapping them in a virtual labyrinth.
Once again, our team actively monitors bot activity and, where Cloudflare is deployed, updates security rules when we identify new scraping techniques.
Related reading: Bot Wars 4.0
Where Cloudflare is used, it allows us to create custom security rules that help control how visitors access the collections. These rules can automatically slow down, challenge, or block suspicious traffic when needed, giving us an additional layer of protection against scraping, abuse, and other malicious activity.
Custom security rules offer fine-grained control over which traffic is flagged and what action is taken in response. They can also be used to explicitly allow specific users or tools that need heightened access to the collection (for example, automated monitoring or reporting tools like SiteImprove).
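The way such rules combine can be sketched in Python as an ordered list in which the first matching rule decides the outcome. This is a simplified model and not Cloudflare's rule syntax: the `Request` and `Rule` types, the rule names, the thresholds, and the addresses are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Request:
    ip: str
    user_agent: str
    requests_last_minute: int


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[Request], bool]
    action: str  # "allow", "throttle", "challenge", or "block"


# Rules are checked in order, so the allow rule for a trusted tool comes first.
RULES = [
    Rule("allow-monitoring-tool", lambda r: r.ip == "192.0.2.10", "allow"),
    Rule("block-known-scraper",
         lambda r: "examplescraper" in r.user_agent.lower(), "block"),
    Rule("challenge-very-heavy-use",
         lambda r: r.requests_last_minute > 300, "challenge"),
    Rule("throttle-heavy-use",
         lambda r: r.requests_last_minute > 120, "throttle"),
]


def decide(request: Request, rules: Sequence[Rule] = RULES) -> str:
    """Return the action of the first rule that matches, or "allow" if none do."""
    for rule in rules:
        if rule.matches(request):
            return rule.action
    return "allow"


if __name__ == "__main__":
    visitor = Request("198.51.100.5", "Mozilla/5.0", requests_last_minute=12)
    scraper = Request("198.51.100.9", "ExampleScraper/1.0", requests_last_minute=40)
    print(decide(visitor), decide(scraper))
```

Placing the allow rule first is a deliberate choice in this sketch: a trusted monitoring tool may generate heavy traffic, and it should not be caught by the rules that follow.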
When we identify a new scraping technique or malicious source, we can update our Cloudflare rules so that other collections using the same protection can benefit from the same improvements.
This shared intelligence means each client benefits from lessons learned across our entire network of collections.