In 2014, we worked with the Colorado State Library to release the new Colorado Historic Newspapers Collection (CHNC) on Veridian. Since the launch, the CHNC team have more than doubled the historic newspaper content that is freely available through the site.
According to Leigh Jeremias, Digital Collections Coordinator at the Colorado State Library, their users have been very pleased with the features Veridian offers, including multiple search options, improved Optical Character Recognition (OCR), user account creation and crowdsourced User Text Correction.
The states of Colorado and Wyoming have a long history of working together on digital collection initiatives. In 2020 they approached Veridian to build a joint collection combining the content of the CHNC, Wyoming State Library and University of Wyoming collections.
The result was the Plains to Peaks digital collection, a single place where users from both states can search and browse the content of both the Colorado and Wyoming collections.
Building and launching the Plains to Peaks digital collection involved two main tasks for the Veridian team: migrating and merging the data, and creating the new interface.
Migrating and merging the Wyoming newspapers into Veridian
Before this exercise, the Wyoming historical newspapers were hosted separately on two different platforms. The Wyoming State Library had a platform that allowed their users to access roughly 800,000 pages of newspaper PDFs from 1849 through to 1922. In addition, the University of Wyoming had roughly 100,000 pages of METS/ALTO National Digital Newspaper Program (NDNP) data from 1863 to 1963 on Chronicling America.
As the first step of this project, we migrated and merged both sources of Wyoming newspapers into a single Veridian site, the Wyoming Digital Newspaper Collection: https://wyomingnewspapers.org/. This created one accepted site for all Wyoming researchers to use when exploring Wyoming history.
Metadata Inheritance (Hierarchical Metadata Structure)
Another important aspect of this migration process was restructuring the record metadata into a hierarchy that is easier to manage in the future. In a Digital Library or Content Management System (CMS) environment, metadata is usually assigned to each individual record so that the system has the flexibility to handle different types of documents. This model works well for artifacts like photos, letters and books, but not for newspapers.
A newspaper issue usually carries publication-level metadata (e.g. publication title, publisher, place of publication, title description), issue-level metadata (e.g. issue volume, number, edition and date), and finally section-level metadata (such as additional notes, comments and tags). Without a hierarchical metadata structure, each piece of publication-level metadata would need to be assigned repeatedly to every individual record, and there can easily be tens of thousands of issues in a single publication. Imagine the time and resources that would be required if the collection manager needed to alter a piece of publication-level metadata, such as correcting a typo in the publication title.
As part of the migration process, our team re-organized the metadata into a proper hierarchical structure (publication-level --> issue-level --> section-level), so each ‘child’ document inherits appropriate metadata values from its ‘parent’ nodes. Having the metadata organized in such a way not only makes the metadata itself more consistent and correct, but also allows metadata to be managed (or additional metadata to be added) more easily in the future.
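The inheritance idea can be illustrated with a minimal Python sketch. This is not Veridian's actual data model: the `Node` class, the `resolved` method, the field names and the "Example Gazette" values are all hypothetical, chosen only to show how a child record can pick up values from its parents.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One level of the hierarchy: a publication, an issue, or a section (hypothetical)."""
    level: str
    own: dict = field(default_factory=dict)  # metadata stored at this level only
    parent: Optional["Node"] = None

    def resolved(self) -> dict:
        """Return inherited metadata, with values set at this level taking precedence."""
        inherited = self.parent.resolved() if self.parent else {}
        return {**inherited, **self.own}


# Illustrative values only; the title deliberately contains a typo.
publication = Node("publication", {"title": "Example Gazete", "place": "Exampleville"})
issue = Node("issue", {"date": "1900-01-01", "volume": "1"}, parent=publication)
section = Node("section", {"note": "front page"}, parent=issue)

# Correcting the typo once at publication level is seen by every descendant.
publication.own["title"] = "Example Gazette"
print(section.resolved()["title"])
```

Because the title is stored once on the publication node, the correction is a single write, however many issues and sections sit beneath it.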
Creation of the Plains to Peaks interface
Once we had both the Colorado and Wyoming data in place, the next major phase of this project was to create the Plains to Peaks interface.
Veridian Hub and Affiliate Model
To bring the Plains to Peaks digital collection to life, we utilized the Veridian Hub and Affiliate Model. A ‘hub’ is a superset of the associated collections, whereas ‘affiliates’ are different representations or subsets of the ‘hub’.
In this project, Plains to Peaks is the hub and the Colorado and Wyoming collections are the affiliates. The benefits of implementing this model are threefold:
- Shared user contribution data - all three collections now share the same user contribution database. This means when a user makes a text correction on the Colorado site, the Plains to Peaks site will benefit from this correction, and vice versa.
- Simplified system maintenance - there is only one base Veridian that requires maintenance. So in future when we need to upgrade these three collections, instead of upgrading three sites separately, we need only upgrade one base plus three interfaces.
- Saved storage space - instead of making duplicates of the source between the three collections, the hub and affiliates now access the same data in this model. This reduces the space required for hosting the site as well as simplifying the backup process.
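As a rough illustration of the hub and affiliate idea, the following Python sketch models three site views over one shared store. It is not how Veridian is implemented: `SharedStore`, `SiteView`, the page identifiers and the sample text are all hypothetical, and a real system would use a database rather than in-memory dictionaries.

```python
class SharedStore:
    """Single copy of the page data and user corrections (hypothetical)."""

    def __init__(self):
        self.pages = {}        # page_id -> {"state": ..., "text": ...}
        self.corrections = {}  # page_id -> user-corrected text


class SiteView:
    """A hub or affiliate: a filtered view onto the shared store."""

    def __init__(self, name, store, states):
        self.name = name
        self.store = store
        self.states = set(states)

    def page_ids(self):
        return sorted(pid for pid, page in self.store.pages.items()
                      if page["state"] in self.states)

    def text(self, page_id):
        # A correction, if one exists, overrides the original OCR text.
        return self.store.corrections.get(page_id, self.store.pages[page_id]["text"])

    def correct(self, page_id, new_text):
        if page_id not in self.page_ids():
            raise KeyError(f"{page_id} is not part of {self.name}")
        self.store.corrections[page_id] = new_text


store = SharedStore()
store.pages["co-1"] = {"state": "CO", "text": "Denvcr"}    # simulated OCR error
store.pages["wy-1"] = {"state": "WY", "text": "Cheyenne"}

hub = SiteView("hub", store, ["CO", "WY"])
colorado = SiteView("colorado", store, ["CO"])
wyoming = SiteView("wyoming", store, ["WY"])

colorado.correct("co-1", "Denver")   # correction made on an affiliate
print(hub.text("co-1"))              # the hub reads the same shared correction
```

Since every view reads from the same store, nothing is duplicated between sites, and a correction entered through any one of them is visible through the others that include that page.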
In December 2020, the Plains to Peaks collection was launched. This project marked a first for Veridian: our first cross-state collection, which allows access to newspaper titles from multiple states all in one place. Across all three collections, there are currently over 854 different newspapers, published from 1849 to 2020, and over 2.9 million digitized newspaper pages, a number which is sure to grow.
As a result of the migration, Wyoming users can now access content more easily and can use Veridian features such as user text correction (UTC).
In building this cross-state collection, we hope that it leads to an even stronger user community to support all three collections.