Libraries, archives, and historical societies work hard to make unique collections available online. These collections are valuable public resources, but they can also attract malicious traffic and be targeted by:
- Automated bots attempting to harvest large amounts of content
- Denial-of-service attacks (explained below) that overwhelm servers
- Malicious attempts to exploit site vulnerabilities
Without appropriate protections, these activities can slow down the site, increase hosting costs, and, in severe cases, make the collection temporarily unavailable. They can also lead to unauthorized copying and misuse of copyrighted material.
For many of the collections we host and support, Cloudflare is an important part of our security and performance strategy. It is one of several tools we use to help safeguard our clients’ content and ensure secure and reliable access.
What is Cloudflare?
Cloudflare is one of the world’s most widely used Content Delivery Networks and web protection platforms, trusted by millions of websites (approximately one in five websites worldwide), including governments, financial institutions, universities, libraries, and major global organizations.
It acts like a protective layer between sites and their visitors, helping keep content secure while ensuring collections remain responsive and reliably accessible.

How Cloudflare protects historical digital collections
Below we’ve outlined what we consider to be the key benefits of using Cloudflare:
1. Traffic monitoring and threat identification
Cloudflare continuously analyzes traffic across its global network to identify suspicious behavior and known malicious sources. It uses this threat intelligence to automatically filter known harmful traffic before it reaches your collection. This includes blocking requests from IP addresses, networks, and automated tools associated with cyberattacks, scraping, and other abusive activity.
Our team uses these capabilities, where Cloudflare is deployed, to monitor traffic patterns, investigate unusual activity, and configure custom security rules. When we identify new scraping techniques or malicious sources, we can quickly update these rules.
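To make the idea of filtering by IP address and network more concrete, here is a minimal Python sketch of how a blocklist check might work in principle. It is an illustration only, not Cloudflare's implementation. The `BLOCKED_NETWORKS` list and the `is_blocked` function are hypothetical names, and the addresses are reserved documentation ranges rather than real threat data.

```python
import ipaddress

# Hypothetical blocklist for illustration only. These are reserved
# documentation ranges (RFC 5737 and RFC 3849), not real threat data.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),    # an entire network
    ipaddress.ip_network("198.51.100.42/32"),  # a single address
    ipaddress.ip_network("2001:db8::/32"),     # an IPv6 network
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    # Comparing an IPv4 address with an IPv6 network (or vice versa)
    # simply evaluates to False, so mixed lists are safe to scan.
    return any(address in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("203.0.113.7", "192.0.2.1", "2001:db8::1"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

In practice, a service like Cloudflare draws on far larger and constantly updated lists, combined with many other signals, but the underlying question for each request is similar: does this source match something already known to be harmful?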
2. Faster access
Where enabled, Cloudflare stores copies (or caches) of frequently accessed content on servers around the world through its Content Delivery Network. This helps collections load faster and respond more quickly for users wherever they are located. By serving much of this content directly, Cloudflare also reduces the load on the collection’s servers, helping lower bandwidth usage and associated hosting costs.
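For readers curious about how caching reduces load on a collection's servers, the following Python sketch shows the general principle with a toy time-to-live cache. It is a simplified assumption of how an edge cache behaves, not a description of Cloudflare's internals. The `EdgeCache` class, its `ttl_seconds` parameter, and the `fetch_from_origin` callback are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy time-to-live cache standing in for a CDN edge server."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 ttl_seconds: float = 300.0) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        # Maps a path to (time stored, response body).
        self._store: Dict[str, Tuple[float, bytes]] = {}
        # Counts how often the collection's own server had to respond.
        self.origin_requests = 0

    def get(self, path: str) -> bytes:
        now = time.monotonic()
        entry = self._store.get(path)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # cache hit: the origin server is not contacted
        # Cache miss or expired entry: fetch from the origin and store a copy.
        self.origin_requests += 1
        body = self._fetch(path)
        self._store[path] = (now, body)
        return body


if __name__ == "__main__":
    cache = EdgeCache(lambda path: f"page for {path}".encode())
    for _ in range(100):
        cache.get("/popular-issue")
    print("origin requests:", cache.origin_requests)
```

Here one hundred visitor requests for the same page result in a single request to the origin server, which is the effect that helps lower bandwidth usage and hosting costs.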
3. Protection against common attacks
Cloudflare protects against common attacks such as Distributed Denial of Service (DDoS) attacks, which attempt to overwhelm a site with excessive traffic; SQL injection attempts designed to manipulate a website’s database; brute-force attacks intended to crack passwords; and other widely recognized vulnerabilities, including those listed in the OWASP Top 10.
4. Defense against automated scraping and bad bots
Some bots are helpful, such as search engines that index content for discovery. Others are designed to copy large amounts of material, overload servers, or misuse website resources. Cloudflare uses machine learning and behavioral analysis to distinguish legitimate users from malicious bots. It also offers innovative tools to deter scrapers, including the ability to serve realistic AI-generated content to suspected bots, trapping them in a virtual labyrinth.
Once again, our team actively monitors bot activity and, where Cloudflare is deployed, updates security rules when we identify new scraping techniques.
Related reading: Bot Wars 4.0
5. Custom security rules
Where Cloudflare is used, it allows us to create custom security rules that help control how visitors access the collections. These rules can automatically slow down, challenge, or block suspicious traffic when needed, giving us an additional layer of protection against scraping, abuse, and other malicious activity.
Custom security rules offer fine-grained control over which traffic is matched and what action is taken when it is. They can also be used to whitelist certain users or tools that need heightened access to the collection (for example, automated monitoring or reporting tools like SiteImprove).
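The logic behind custom rules can be sketched in a few lines of Python. This is a conceptual illustration only: it does not use Cloudflare's own rule syntax or API, and the `Request`, `Rule`, and `decide` names, the example thresholds, and the IP addresses are all assumptions made for the example. Rules are checked in order and the first match determines the action, so a whitelist rule placed first takes priority over the rules below it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Request:
    ip: str
    user_agent: str
    path: str
    requests_last_minute: int


@dataclass
class Rule:
    name: str
    matches: Callable[[Request], bool]
    action: str  # "allow", "challenge", or "block"


# Hypothetical rules; the first matching rule wins.
RULES: List[Rule] = [
    # Whitelist a trusted monitoring tool (documentation-range IP).
    Rule("allow-monitoring-tool", lambda r: r.ip == "192.0.2.10", "allow"),
    # Block clients that send no user agent at all.
    Rule("block-empty-user-agent",
         lambda r: r.user_agent.strip() == "", "block"),
    # Challenge unusually heavy use of search pages.
    Rule("challenge-heavy-search",
         lambda r: r.path.startswith("/search")
         and r.requests_last_minute > 60,
         "challenge"),
]


def decide(request: Request, rules: List[Rule] = RULES) -> str:
    """Return the action of the first matching rule, or "allow" if none match."""
    for rule in rules:
        if rule.matches(request):
            return rule.action
    return "allow"


if __name__ == "__main__":
    visitor = Request("198.51.100.5", "Mozilla/5.0", "/search?q=1890", 120)
    print(decide(visitor))
```

Ordering matters in a design like this. Because the whitelist rule is evaluated first, the monitoring tool is allowed through even if its traffic would otherwise trigger a later rule.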
6. Shared protection across all collections
When we identify a new scraping technique or malicious source, we can update our Cloudflare rules so that other collections using the same protection can benefit from the same improvements.
This shared intelligence means each client benefits from lessons learned across our entire network of collections.
Frequently asked questions (FAQs)
Does every Veridian collection use Cloudflare?
Not necessarily. Many of the collections we host and support use Cloudflare as part of their security and performance setup, while others may use different arrangements depending on their hosting environment and specific requirements.
Does Cloudflare cost extra?
For clients who host their own collections, a separate Cloudflare subscription is typically required, and our team can assist with setup and configuration.
For collections hosted by Veridian, a baseline level of Cloudflare protection is often included at no additional cost. Clients may also choose to implement more advanced protection features, which may involve additional charges depending on their specific requirements.
Do we need to configure or manage Cloudflare ourselves?
Whether your collection is hosted by Veridian or within your own infrastructure, our team can manage Cloudflare on your behalf, including monitoring traffic, configuring security rules, and responding to emerging threats.
Because our team has visibility across the Cloudflare environments we manage, we can quickly apply lessons learned from one collection to others. When we identify new scraping techniques or malicious traffic patterns, we can update security rules across multiple collections, helping reduce vulnerabilities and strengthen protection for all clients using Cloudflare.
Will this affect how researchers and other users access our collection?
In most cases, users will not notice Cloudflare at all. The service works behind the scenes to improve performance and block suspicious traffic. Occasionally, visitors displaying unusual behavior may be asked to complete a simple verification step.
Will Cloudflare make our collection faster?
In most cases, yes. Where caching is enabled, Cloudflare stores copies of frequently accessed content on servers around the world, which can improve page load times and responsiveness, especially for international users.
Does Cloudflare stop all scraping and cyberattacks?
No security solution can prevent every attack. However, Cloudflare significantly reduces the impact of common threats such as DDoS attacks, automated scraping, and attempts to exploit website vulnerabilities.
Even when bad actors get past Cloudflare's automated protection through some novel technique, the platform offers analysis and visualizations that help us identify them and craft custom rules to block those specific approaches.
Does Cloudflare block search engines like Google?
No. Search engines such as Google are recognized as legitimate bots and are allowed to continue indexing your collection for discovery.
Does Cloudflare affect search engine optimization (SEO)?
No. Cloudflare is widely used by websites that depend on strong search visibility. When configured correctly, it has no negative impact on SEO and can even improve performance, which may benefit search rankings.
How will we know if our collection is being targeted?
Veridian monitors traffic and security alerts on an ongoing basis. If significant issues arise, our team investigates and takes action to protect the collection.