Why Digital Archive Monitoring Matters: A National Library Case Study

August 28, 2025

Learn how continuous monitoring enabled our team to identify a bottleneck behind collection access delays at a national library and restore fast, reliable performance.

Even with automated monitoring and performance charts in place, pinpointing the exact cause of a slowdown isn’t always straightforward, especially during peak usage times. That was the case for a national library, where Veridian’s monitoring detected that some users were waiting up to 60 seconds when trying to access the library’s digital archives.

Diagnosing the slowdown

At first, it looked like a case of needing to beef up the server’s CPU or RAM to handle peak traffic. But deeper analysis revealed that while most user requests were being processed quickly—within three seconds—some were getting significantly delayed during busy periods. The server wasn’t crashing, but user experience was inconsistent and frustrating.
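
A pattern like this, where most requests are fast and a few are very slow, is easy to miss in an average. The following is a minimal sketch of how tail percentiles expose it. The sample values are hypothetical and only loosely modeled on the figures above; they are not the library's data.

```python
import statistics


def latency_summary(samples_s):
    """Return mean, median, 99th percentile and max of request times (seconds)."""
    ordered = sorted(samples_s)
    # quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p99": cuts[98],
        "max": ordered[-1],
    }


# Hypothetical sample: 97 fast requests and 3 badly delayed ones.
samples = [2.0] * 97 + [45.0, 55.0, 60.0]
summary = latency_summary(samples)
print(summary)
```

In this made-up sample the mean stays close to the three-second mark while the 99th percentile sits near the worst-case delay, which is why tail metrics are usually tracked alongside averages.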

To get to the bottom of the issue, the Veridian team dug into several layers of system data:

Graphserver charts showed how long different types of requests were taking to complete.
kSar system activity reports visualized server resource usage over time.
Runtime logs were reviewed manually to identify any unfinished processes that could be clogging the system.

Digital archive monitoring uncovered the real issue

After carefully correlating logs, the team identified a clear pattern: the delays lined up with spikes in SWAP memory usage.
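
As a rough illustration of this kind of correlation check, the sketch below computes a Pearson correlation between two time series. The per-minute numbers are invented for the example, and the article does not say which statistical method, if any, the team used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-minute samples taken over the same six minutes.
swap_pages_per_s = [100, 150, 3400, 3200, 200, 120]
slow_requests = [0, 1, 14, 11, 1, 0]

r = pearson(swap_pages_per_s, slow_requests)
print(f"correlation between swap activity and slow requests: {r:.2f}")
```

A value close to 1 would suggest the two series rise and fall together. It does not prove causation on its own, so the logs still need to be read.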

SWAP is overflow space for memory: when the server’s main RAM fills up, the operating system moves less recently used data out to disk. But disks, especially conventional spinning hard drives, are much slower than RAM. In this case, the server had two SWAP partitions:

A slower 15GB partition.
A faster 64GB partition.

Unfortunately, the system was defaulting to the slower 15GB volume, which was causing major slowdowns when it filled up during high-traffic periods.

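Linux decides which SWAP area to use first based on a value called “priority”: areas with a higher priority are filled before areas with a lower one. If priorities aren’t set explicitly, the kernel assigns each newly activated area a lower default priority than the one before it, so the SWAP device activated first is used first. In this case that was the slower 15GB volume.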

This slower partition was likely a legacy configuration—for example, one automatically created by the operating system during installation. Meanwhile, the larger, faster 64GB partition was added later, but without explicit prioritization, it wasn’t being fully utilized.
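
On a Linux host the active SWAP areas and their priorities are listed in /proc/swaps. The sketch below parses that format and sorts the areas in the order the kernel would prefer them. The device names and numbers in the sample are hypothetical stand-ins, since the article does not give the real ones.

```python
from typing import NamedTuple


class SwapArea(NamedTuple):
    name: str
    kind: str
    size_kib: int
    used_kib: int
    priority: int


def parse_proc_swaps(text):
    """Parse /proc/swaps content: a header line, then one line per swap area."""
    areas = []
    for line in text.splitlines()[1:]:
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or unexpected lines
        name, kind, size, used, priority = parts
        areas.append(SwapArea(name, kind, int(size), int(used), int(priority)))
    return areas


def fill_order(areas):
    """Higher priority is used first; equal priorities are shared by the kernel."""
    return sorted(areas, key=lambda a: a.priority, reverse=True)


# Hypothetical sample in /proc/swaps format (device names are invented).
SAMPLE = """Filename        Type        Size      Used     Priority
/dev/slow15g    partition   15728640  9000000  -2
/dev/fast64g    partition   67108864  0        -3
"""

order = fill_order(parse_proc_swaps(SAMPLE))
for area in order:
    print(area.name, area.priority)
```

On a live system the same functions could be fed with `open("/proc/swaps").read()` instead of the sample string.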

As a result, during peak traffic, the system offloaded memory to the slower device first—leading to delays. Once identified, the fix was simple.

The fix

The team disabled the slower 15GB SWAP partition and reconfigured the system to use only the faster 64GB one, which had much better IOPS (input/output operations per second). This change let the server move data between RAM and SWAP far more quickly under load.
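
The article does not describe the exact commands used, so the following is only a sketch of one common way to make such a change on Linux: turn the slow area off with `swapoff`, then comment out its /etc/fstab entry so it stays off after a reboot. The device names are hypothetical, and the code only builds the commands and the edited text; it runs nothing and writes no files.

```python
def plan_swap_fix(slow_device):
    """Return commands (as argument lists) to run as root; nothing is executed."""
    return [
        ["swapoff", slow_device],  # stop using the slow swap area
        ["swapon", "--show"],      # list the swap areas that remain active
    ]


def comment_out_swap_entry(fstab_text, device):
    """Comment out the fstab swap line for `device`, leaving other lines as-is."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_target = (
            len(fields) >= 3
            and not line.lstrip().startswith("#")
            and fields[0] == device
            and fields[2] == "swap"  # third fstab field is the filesystem type
        )
        out.append("# " + line if is_target else line)
    return "\n".join(out) + "\n"


# Hypothetical fstab excerpt (device names are invented for illustration).
SAMPLE_FSTAB = """/dev/slow15g none swap sw 0 0
/dev/fast64g none swap sw 0 0
"""

commands = plan_swap_fix("/dev/slow15g")
updated = comment_out_swap_entry(SAMPLE_FSTAB, "/dev/slow15g")
print(commands)
print(updated)
```

Disabling an area forces the kernel to move anything stored there back into RAM or into the remaining SWAP, so a change like this is usually made outside peak hours.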

The result

Significantly faster performance during peak hours.
Fewer concurrent processes during peak times mean less congestion.
A major drop in memory delays: SWAP activity fell from 3400 to just 225 pages swapped per second.
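
To put these figures in relative terms, the short sketch below works out the percentage drop from the before and after values reported in this article (SWAP pages per second and concurrent process counts).

```python
def percent_drop(before, after):
    """Percentage reduction from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


swap_drop = percent_drop(3400, 225)    # pages swapped per second
process_drop = percent_drop(120, 60)   # concurrent processes at peak

print(f"SWAP activity down {swap_drop:.1f}%, processes down {process_drop:.1f}%")
```

That works out to roughly a 93% reduction in SWAP activity and a 50% reduction in concurrent processes.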

Chart (before and after): concurrent process counts, with the y-axis showing a drop from 120 to 60. Fewer concurrent processes mean less congestion.

Chart (before and after): elapsed time by request, i.e. how long a user has to wait for something to load or complete after they click, search, or interact.

Chart (before and after): SWAP performance, with the y-axis showing a fall from 3400 to just 225 pages swapped per second.

This fix was applied quickly—thanks to the experience of Veridian’s technical team—and made a visible difference to everyone using this national library’s digital collection. It’s a great example of how expert monitoring and investigation can uncover hidden issues and improve performance.

Curious how our monitoring works—and what else Veridian and our team can help you achieve? Ask us.
