This article explains the latest developments and features added to Veridian.
Our digital collection software is continuously developed, polished and enhanced. We research the latest technologies, gather feedback from our clients and incorporate both into the core Veridian platform.
So, what's new?
1. IIIF Image and Presentation API support
By enabling IIIF viewing of more Veridian collections, we hope to help stimulate the development of IIIF for online newspapers. Once IIIF support is added to a collection, any IIIF-compliant viewer or application can consume and display its images, text, and metadata.
IIIF allows interoperability across and between repositories and could encourage third parties to use your content in new and interesting ways.
See IIIF in action at the Swiss National Library’s digital newspaper archive. Follow the example links below to see the IIIF icons that generate the IIIF API output for that particular publication/document.
https://www.e-newspaperarchives.ch/?a=cl&cl=CL1
https://www.e-newspaperarchives.ch/?a=d&d=LBP18881103-01
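As a rough illustration of what an IIIF-compliant client does, the sketch below builds request URLs following the IIIF Image API pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The base URL and identifier are hypothetical placeholders. This article does not describe how Veridian lays out its IIIF endpoints, so take the real values from the IIIF output of the collection itself.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL.

    `base` and `identifier` are assumptions for illustration only;
    they are not taken from any real Veridian collection.
    """
    # The identifier must be percent-encoded, including any slashes.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


def iiif_info_url(base, identifier):
    """Build the URL of the image information document (info.json)."""
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


# Hypothetical server and page identifier.
print(iiif_image_url("https://example.org/iiif", "LBP18881103-01/page 1"))
print(iiif_info_url("https://example.org/iiif", "LBP18881103-01/page 1"))
```

A viewer would typically fetch `info.json` first to learn the available sizes, then request image regions as needed.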
2. Ingestion of born-digital PDFs
Born-digital PDFs can be directly ingested into Veridian collections, providing more scope for the inclusion of contemporary publications, without the need to go through the METS/ALTO conversion process.
Born-digital PDFs will be ingested as page-level documents. If article segmentation becomes available in the future, these issues can be replaced as part of the annual support and maintenance service.
The example link shows a contemporary born-digital PDF newspaper issue included in Lehigh University’s The Brown and White student newspaper archive:
https://bwarchive.lib.lehigh.edu/?a=d&d=BW20200407
3. User Text Correction (UTC) Dashboard
The UTC dashboard brings together key information on one page to raise the profile of text correction and encourage users to get involved.
Included are ‘how-to’ guides and ‘recommended items to be corrected’ to help users choose which content to work on. The dashboard also displays text correction statistics to track the progress of community contributions and reward the most active participants with a ‘text correctors hall of fame’ placing.
See an example UTC dashboard page on the Vassar College Digital Newspaper & Magazine Archive:
https://newspaperarchives.vassar.edu/?a=p&p=textcorrecthome
4. Visualisation of search results in a bar graph
This feature presents search results in a bar graph, grouped by publication date. The bar graph sits alongside the search result snippets.
By hovering over the graph, users can see at a glance the number of search results returned within each year. Graph results can be filtered by decade.
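The grouping behind such a graph can be sketched in a few lines. This is only an illustration of counting results per year and narrowing to a decade, not Veridian's implementation; the function names and the sample years are made up.

```python
from collections import Counter


def results_per_year(years):
    """Count search results per publication year, ordered by year."""
    return dict(sorted(Counter(years).items()))


def filter_by_decade(counts, decade_start):
    """Keep only the years that fall within the given decade."""
    return {year: n for year, n in counts.items()
            if decade_start <= year < decade_start + 10}


# Hypothetical publication years of six search hits.
hits = [1888, 1891, 1888, 1902, 1891, 1888]
counts = results_per_year(hits)
print(counts)
print(filter_by_decade(counts, 1890))
```

Each key in the resulting dictionary corresponds to one bar, and its value to the bar's height.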
See an example of the search results bar graph on the Vassar College Digital Newspaper & Magazine Archive:
5. Tag suggestions
Registered users can add public tags to articles. These tags can be browsed and used to narrow down searches.
This new ‘tag suggestion’ feature assists users by displaying a list of suggested tags once two or more characters have been typed. The suggestion list is derived from existing tags, and optionally from a tag dictionary preset by the collection owner.
The tag suggestion feature can be seen on the Swiss National Library’s digital newspaper archive:
https://www.e-newspaperarchives.ch/?a=d&d=LBP18881103-01.2.1
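The suggestion behaviour described above could look something like the following sketch. It assumes case-insensitive prefix matching and a cap on the number of suggestions; neither detail is stated in this article, and the function and parameter names are hypothetical.

```python
def suggest_tags(typed, existing_tags, dictionary=(), min_chars=2, limit=10):
    """Suggest tags once at least `min_chars` characters have been typed.

    Candidates come from existing tags plus an optional preset dictionary.
    Prefix matching and the `limit` cap are assumptions for illustration.
    """
    query = typed.strip().lower()
    if len(query) < min_chars:
        return []
    # Merge both sources and drop exact duplicates.
    candidates = set(existing_tags) | set(dictionary)
    matches = [tag for tag in candidates if tag.lower().startswith(query)]
    # Sort case-insensitively so the list order is stable.
    return sorted(matches, key=lambda tag: (tag.lower(), tag))[:limit]


existing = ["Railway", "radio", "Election"]
preset = ["Railway", "Rationing"]
print(suggest_tags("ra", existing, preset))  # ['radio', 'Railway', 'Rationing']
print(suggest_tags("r", existing, preset))   # [] (fewer than two characters)
```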
6. Strong password enforcement
Strong password enforcement will be enabled for new user registrations, and whenever existing users choose to change their passwords. Passwords must be at least 8 characters long and contain at least one letter, one number, and one other character (punctuation/symbol/space).
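The rule can be expressed as a short check. This is a minimal sketch of the stated policy, not Veridian's own validation code, and the function name is hypothetical.

```python
def is_strong_password(password):
    """Check the rule: 8+ characters, with at least one letter,
    one number, and one other character (punctuation, symbol, or space)."""
    has_letter = any(c.isalpha() for c in password)
    has_number = any(c.isdigit() for c in password)
    # Anything that is neither a letter nor a digit counts as "other".
    has_other = any(not c.isalnum() for c in password)
    return len(password) >= 8 and has_letter and has_number and has_other


print(is_strong_password("library1!"))  # True
print(is_strong_password("library12"))  # False: no punctuation, symbol, or space
print(is_strong_password("lib 2020"))   # True: a space counts as "other"
```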
An example of a collection with strong password enforcement at registration can be seen here:
7. Anti-scraping package
We’ve seen a recent rise in collection scraping activity, where PDF content has been harvested from Veridian collections for use on a subscription-based newspaper website.
To help prevent unauthorized content scraping, we are offering the following options. These can be implemented separately, in combination or as a package.
(a) Enable reCAPTCHA on PDF downloading
This solution is designed to stop robots from downloading PDFs. Note that we use reCAPTCHA v2 (Invisible reCAPTCHA badge), which does not require the user to click on the “I am not a robot” checkbox.
The implementation will not complicate or alter the user experience of downloading a PDF, other than adding a few extra seconds for the background reCAPTCHA check and redirect to the PDF.
An example of reCAPTCHA on PDF downloading can be seen here:
(b) PDF downloads behind registration
This will restrict PDF downloads to registered users. It will make it harder for robots to harvest the content, but it will not stop them completely.
An example of PDF downloads behind user registration can be seen here:
https://www.e-newspaperarchives.ch/?a=d&d=LBP18881103-01.2.1
(c) Adding reCAPTCHA to the user registration process, if not enabled already
As with option (a), we will use reCAPTCHA v2 (Invisible reCAPTCHA badge), which does not require the user to click on the “I am not a robot” checkbox.
An example of user registration with reCAPTCHA enabled can be seen here:
(d) Adding the new ‘terms of use’ page, with the option to modify the text to suit individual collection needs
For collections that do not currently display ‘terms of use’ information, we will implement a new generic ‘terms of use’ page. A link to the page will be placed in the collection footer, alongside the ‘privacy policy’ link.
We will include some generic ‘terms of use’ text which prohibits the commercial use of collection content and indicates that it may only be used for private purposes. This text may be modified to suit individual collections. Please seek the advice of your legal team for collection-specific wording.
An example of the ‘terms of use’ link can be seen on the footer of the Kent Stater Archive collection: https://dks.library.kent.edu/
Please let your Veridian developer know if you have any questions about the new developments and features mentioned here.
Questions or comments?
Feel free to get in touch.