If your digitized newspapers are to be used and shared across platforms, and to survive into the future, it is important to choose appropriate standards and follow best practices.
A large number of different formats, standards, and workflows are still used by newspaper digitization projects. They include well-known formats like PDF, as well as formats modelled around the requirements of specific access/hosting software, like CONTENTdm or the PR XML format used by Olive Software.
Since about 2007, a clear set of standards and best practices for newspaper digitization has emerged, combining:
- METS (Metadata Encoding and Transmission Standard),
- ALTO (Analyzed Layout and Text Object), and
- associated high-resolution images.
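To make the ALTO part of this combination more concrete, the following is a minimal Python sketch of reading the recognized text out of an ALTO file. The sample XML is invented for illustration and is far smaller than a real newspaper page; the function names `local_name` and `alto_text_lines` are hypothetical helpers, not part of any library. ALTO namespaces differ between schema versions, so the sketch matches on element names only.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def alto_text_lines(alto_xml):
    """Return the text of each ALTO TextLine as a list of strings.

    Each word is stored in the CONTENT attribute of a String element;
    the words of a line are joined with single spaces.
    """
    root = ET.fromstring(alto_xml)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        words = [
            child.get("CONTENT", "")
            for child in elem
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(words))
    return lines


# Invented sample: one text block with two lines. Real files also carry
# page dimensions, styles, and OCR confidence values.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="Shipping" HPOS="100" VPOS="200" WIDTH="180" HEIGHT="40"/>
            <SP/>
            <String CONTENT="News" HPOS="300" VPOS="200" WIDTH="110" HEIGHT="40"/>
          </TextLine>
          <TextLine>
            <String CONTENT="Arrived" HPOS="100" VPOS="250" WIDTH="150" HEIGHT="40"/>
            <SP/>
            <String CONTENT="yesterday" HPOS="270" VPOS="250" WIDTH="200" HEIGHT="40"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""

for line in alto_text_lines(SAMPLE_ALTO):
    print(line)
```

Because every word carries its own position on the page (the `HPOS`, `VPOS`, `WIDTH`, and `HEIGHT` attributes), access software can highlight search hits directly on the associated page image.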
But what if I’ve already digitized my newspapers in a different format?
In many cases, we can help migrate existing collections from formats like PDF or PR XML to METS/ALTO. If you have an existing digitized newspaper collection and would like to discuss migrating it to METS/ALTO, please contact us.