From tens of thousands to more than 100 million pages, Veridian Software is designed to support digital newspaper collections at any scale—without compromising performance or usability.
In the GLAM (galleries, libraries, archives, and museums) sector, scalability is usually discussed in the context of very large collections. It is something we work with every day, supporting digital newspaper collections that range from tens of thousands of pages to several million.
For more than 20 years, we have been refining Veridian’s performance as collections have grown, usage patterns have changed, and expectations have risen. Scalability is not something we “set and forget”: it is an ongoing process.
That long-term focus on performance is what enables Veridian to support very large collections in practice. Today, the platform is capable of supporting collections with more than 100 million pages.
Many platforms can handle 50,000 or even 500,000 pages, but far fewer are designed to support collections that grow into the millions, as Veridian does.
But scalability is about more than size alone
Scalability is not about making a very large collection perform identically to a small one. Larger collections naturally introduce more complexity — including larger indexes, higher concurrency (more users accessing the system at the same time), and more demanding search and browsing patterns.
What matters is how well performance scales relative to that growth, and whether the software platform can be continuously tuned as complexity increases to maintain a consistent and stable user experience.
Designed for ongoing growth
Digitisation projects are often treated as if they have a finish line. In reality, launching a collection is just the beginning.
In our experience, digital newspaper collections rarely stand still. New content is added over time, metadata is refined, and usage patterns continue to evolve. As collections grow and mature, it’s critical that the software supporting them continues to perform reliably — without slowing down or becoming difficult to use.
Why scalability matters to users
Today’s users have been conditioned by modern search engines and digital platforms. Whether a collection contains 50,000 pages or more than 100 million, user expectations are the same: fast search, reliable access, and responsive discovery.
As collections grow, it becomes increasingly important that the platform continues to behave in ways users recognise and trust, even as the underlying complexity increases. When performance noticeably degrades, users don’t interpret this as a consequence of scale; they assume something is wrong.
Systems that are not designed to scale effectively often show signs of strain over time, such as:
- Longer wait times after submitting a search.
- Inconsistencies in search result ranking.
- Lags when browsing using filters, such as publication date or newspaper title.
- Errors or timeouts during periods of high concurrent usage.
- User features — such as clipping, tagging, or commenting on articles — becoming slow or unresponsive.
The Veridian Software platform is designed to minimise these issues through efficient scaling and ongoing performance tuning as data volumes and usage increase.
Collection spotlight: LLMC — over 100 million pages online
The LLMC collection provides online access to more than 100 million pages of legal and government materials. It includes over 180,000 volumes across 40,000 titles, with coverage spanning more than 185 countries and 65 languages.
Software scalability and infrastructure capacity
Veridian’s software is designed to scale efficiently within a given environment, handling large page volumes and search indexes without introducing bottlenecks. However, overall performance during periods of high usage is also influenced by the underlying hosting infrastructure.
An everyday way to think about this is a supermarket self-checkout. Veridian’s software is the checkout system itself — designed to process items quickly and efficiently. The hosting infrastructure is the number of self-checkouts operating. If customer numbers increase but the number of checkouts stays the same, queues form — not because the checkout system is inefficient, but because capacity hasn’t been adjusted.
For this reason, Veridian offers managed hosting built on Amazon Web Services, where infrastructure can be sized and adjusted to meet demand. Some clients choose to use their own hosting infrastructure, and in those cases we work closely with them to ensure it is appropriately provisioned. Without sufficient capacity, performance issues related to high user activity can occur — not because of the software, but because of infrastructure limits.
This approach gives libraries flexibility, while ensuring scalability is addressed holistically across both software and hosting.
Find out how Veridian Software can support your collection