The National Library of New Zealand’s (NLNZ) multilingual Papers Past collection delivers digitised full-text New Zealand and Pacific newspapers, magazines and journals, letters and diaries, parliamentary papers and now books from the 19th and 20th centuries. It has grown to include more than one million documents across these formats, with close to 8 million pages in total.
The Veridian team has been working with Papers Past since 2006, ingesting data and customising the Veridian digital collection software to help create Papers Past on the web. Recently, we spoke with Melanie Lovell-Smith and Emerson Vandy from the National Library of New Zealand about their journey to launch the new Ngā Tānga Reo Māori (Māori language publications) collection and books interface.
Origins of the project
Back in 2013 and 2015, NLNZ completed two pilot projects to digitise a selection of books, resulting in 330 titles joining their National Digital Heritage Archive (NDHA). Whilst they were available to the public on this platform, they weren’t particularly discoverable.
Later, the team began work on the ‘Books in Māori’ pilot, linking back to the creation of a bibliography in the early 2000s which listed over 1,500 items printed in te reo Māori from 1815 to 1900. Work was undertaken to ascertain how many of these titles were digitised, and thus how many remained on the to-do list.
The ‘Books in Māori’ project title is something of a misnomer, as the bibliography contains books, ephemera, newspapers and petitions, some of which had already been digitised as part of earlier projects. This is why the Māori title ‘Ngā Tānga Reo Māori’ is used on Papers Past rather than the working pilot title ‘Books in Māori’: it encompasses a broader range of material.
The NLNZ team held a hui (Māori for meeting) with significant Māori academics and subject experts from libraries, and pulled together this pilot to see if they could make all of the 19th century te reo Māori material available online. As part of this, the team extended the Papers Past platform so that it could display books, primarily for the books in Māori. This also allowed other digitised books held by NLNZ to be made available on the platform.
Building the collection
Data processing and digitisation
The books from the two original pilot projects are all in METS/ALTO, but their metadata and naming conventions were created differently, and they were digitised at the page level rather than the article level now used. So there was a bit of data wrangling required to reconcile these disparate data standards (typical of pilots) and put the collection together.
Data processing and digitisation generally followed NLNZ’s normal processes, which we discussed in a previous article. However, the scanning process had to be handled differently. Usually, NLNZ sends its newspapers and magazines to its vendors as either microfilm or a spare set. As the ‘Books in Māori’ content is so precious, it couldn’t leave the building, so NLNZ had to arrange for a vendor to come in and photograph it on-site. Once the digital images were provided, they were sent for processing as usual.
NLNZ also had to be careful in handling the books due to their age and bindings, and it was often a complicated process to get pages flat. Something they learned for future projects is the need to get a conservator on board early. The collection includes some historical posters, which arrived at the National Library folded up in tiny books. The conservator decided to unbind these books and flatten the posters, which allowed the vendor to capture much better images of the material. The posters are now stored flat, which will also be better for the physical material in the long term.
Ngā Tānga Reo Māori collection and books interface
To conceive and build the books interface, Emerson from NLNZ started by creating some high-fidelity wireframes of what the interface could look like. He took the CSS from their Letters and Diaries section, duplicated it, changed the colour and started experimenting with filters based on the book metadata they had available. These were then used for iterative user testing over Zoom (due to Covid), to get a feel for what people liked and disliked.
After four or five rounds of user testing, moving from a paper prototype, to a basic working HTML interface, to a polished one, the result was a minimum viable interface for accessing the books material, all without the input of a designer.
A key decision based on user feedback was to display the entire book, rather than offer it as an EPUB/e-book download. The feedback was that this was very much a research collection rather than one for leisure reading, so users weren’t interested in downloading the books onto an e-reader (for example). The team also found that users weren’t interested in the books presenting as ‘real’ books with pages you can flip, so they were presented as you can see today.
Where they’ve landed is that the first 20 pages of a book display when it is viewed, and users can then click through to the rest of the content. This was a trade-off: users see a reasonable selection of the book and can access the whole work by saving a copy as a PDF, but the page doesn’t load so much content that load times balloon out.
The downloadable PDFs offered on the platform are at a higher colour depth than the versions delivered in the web interface itself, which keeps page render times down. Page colour and appearance are also an artefact of the digitisation pilots, which were completed to different colour standards at various points in time.
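The trade-off described above can be sketched in a few lines of Python. This is an illustration only, not the Papers Past or Veridian implementation: the names `BookView` and `initial_view` are hypothetical, and the only figure taken from the article is the 20-page initial display.

```python
from dataclasses import dataclass

# Number of pages shown on first load (the figure quoted in the article).
INITIAL_PAGE_COUNT = 20


@dataclass
class BookView:
    """Hypothetical view model: what is rendered now and what is deferred."""
    visible_pages: list[int]   # 1-based page numbers rendered immediately
    remaining_pages: int       # pages reachable via the click-through


def initial_view(total_pages: int, limit: int = INITIAL_PAGE_COUNT) -> BookView:
    """Render at most `limit` pages up front and defer the rest."""
    shown = max(0, min(total_pages, limit))
    return BookView(
        visible_pages=list(range(1, shown + 1)),
        remaining_pages=max(0, total_pages - shown),
    )


# A 150-page book renders 20 pages and defers 130;
# a 12-page pamphlet is rendered in full.
long_book = initial_view(150)
pamphlet = initial_view(12)
```

Capping the initial render bounds the cost of the first page load regardless of how long the book is, which is the point of the compromise.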
A new feature developed for Papers Past as part of the books collection is the availability of bibliographical details in a range of citation formats (APA, Chicago, MLA) for each publication. For this first iteration, the team used an extract from the National Library catalogue as the source data. Due to interest in this feature appearing in the other collections on Papers Past, further development will be explored.
Another defining interface feature is having the search and browse functions combined on one page. As a user, if you adjust the publication date slider at the top of the page, the filter details under search on the left and the results displayed will adjust in real time.
This is viable due to the smaller size of the books collection (compared to the newspapers collection), with 422 books so far, and was actually a ‘happy accident’ when building and testing the interface. When it was discovered, the team was excited about the interactivity it provided, and overall user testing showed that the majority preferred this new type of interface. Thus, it stayed and the NLNZ team will likely consider it as a possible way forward for future collections.
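As a rough illustration of why a small collection makes this kind of live updating cheap, the sketch below refilters an in-memory list and recomputes the sidebar counts on every slider change. It is an assumption-laden example, not how Papers Past is built: the `Book` fields, the language facet and the function names are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    """Hypothetical minimal record; real catalogue data has many more fields."""
    title: str
    year: int
    language: str


def filter_by_years(books: list[Book], start: int, end: int) -> list[Book]:
    """Keep books published within the inclusive year range set by the slider."""
    return [b for b in books if start <= b.year <= end]


def facet_counts(books: list[Book]) -> Counter:
    """Recompute sidebar counts (here, per language) for the current results."""
    return Counter(b.language for b in books)


def on_slider_change(books: list[Book], start: int, end: int):
    """Return the results and the matching sidebar counts in one pass."""
    results = filter_by_years(books, start, end)
    return results, facet_counts(results)


# Placeholder records, invented purely for the example.
sample = [
    Book("Placeholder A", 1835, "Māori"),
    Book("Placeholder B", 1862, "Māori"),
    Book("Placeholder C", 1890, "English"),
]
results, counts = on_slider_change(sample, 1830, 1870)
```

With a few hundred records, a full rescan on each change is negligible; a collection the size of the newspapers would more likely need an indexed search backend.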
The subject filter on the books collection is also a significant point of difference compared to the rest of Papers Past. Each title has a range of subjects recorded in the NLNZ catalogue. During user testing, NLNZ received feedback that people wanted some form of subject-based filtering or browsing. Manually curating subject data was out of the question due to time and budget constraints, so the team pulled the data from their catalogue. This did create some inconsistencies and compromise was required; it’s more of a ‘subject snapshot’ where users can type free text to find subject options and select one that matches, which then filters the results.
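The ‘type to find a subject, then filter’ behaviour can be approximated as below. This is a minimal sketch under stated assumptions: it uses simple case-insensitive substring matching, and the function names and sample headings are hypothetical rather than taken from the NLNZ catalogue or the Papers Past code.

```python
def suggest_subjects(query: str, subjects: list[str], limit: int = 10) -> list[str]:
    """Return catalogue subjects containing the typed text, case-insensitively."""
    needle = query.strip().casefold()
    if not needle:
        return []
    # A set removes duplicate headings; sorting keeps the suggestions stable.
    matches = sorted({s for s in subjects if needle in s.casefold()})
    return matches[:limit]


def filter_by_subject(books: dict[str, list[str]], subject: str) -> list[str]:
    """Return the titles whose subject list includes the selected heading."""
    return [title for title, subs in books.items() if subject in subs]


# Hypothetical data: titles mapped to catalogue subject headings.
catalogue = {
    "Placeholder A": ["Hymns, Māori", "Bible"],
    "Placeholder B": ["Māori language", "Grammar"],
    "Placeholder C": ["Bible"],
}
all_subjects = [s for subs in catalogue.values() for s in subs]
options = suggest_subjects("bib", all_subjects)
titles = filter_by_subject(catalogue, "Bible")
```

Because the headings come straight from the catalogue, near-duplicate or inconsistently worded subjects would appear as separate options, which matches the compromise the team describes.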
To ensure that proper cultural protocols were observed during this project, NLNZ recruited Clare Butler (Ngāti Kahungunu ki Wairoa) as the Māori engagement advisor. Her role was to look through the content being digitised, reach out to the whānau (extended family, family group) or hapū (kinship group, clan, tribe, subtribe) and advise them of what was happening and make sure there were no concerns from their perspective. This was key for relationship building and confirming each book was acknowledged or handled in the right way in the eyes of the community.
As part of these considerations and because the books are no longer in copyright according to New Zealand legislation, NLNZ also developed a copyright statement to communicate that other considerations apply to any reuse of the material. This statement says that the Library acknowledges these taonga (treasured possessions) have a spirit and connection to their ancestors. It asks users to treat them with respect and follow the guidelines around caring for Māori material.
Looking at the site’s navigability, it’s important to understand that Papers Past started as a website for historic newspapers; its redesign in 2016 allowed for other formats such as magazines and journals, but the search functions remained specific to each format. As mentioned above, these formats already contained some of the material listed in ‘Books in Māori.’ So, as well as needing to create a books interface to display the books component of the collection (as they could already display the other content types on Papers Past), the team needed to work out a way of navigating through the ‘Books in Māori’ content across the site.
In creating the Ngā Tānga Reo Māori collection, NLNZ engaged a design vendor to work formally with stakeholders and potential users on how to integrate the horizontal Ngā Tānga Reo Māori collection across the separate content types. The result was an interface design change from previous versions of Papers Past: underneath the search text input box, users can set the scope to either all content or Ngā Tānga Reo Māori content.
On an individual piece of content, from the search sidebar, users can initiate subsequent searches within that content, or within all content or within Ngā Tānga Reo Māori. This was a tricky piece of script to develop, but helped create the horizontality of search across this collection, within all the other content types.
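The three search scopes described above (within the current item, within Ngā Tānga Reo Māori, or across all content) can be modelled as a single scope parameter applied before text matching. The sketch below is illustrative only: the `Scope` enum, the `Record` fields and the `in_nga_tanga` flag are hypothetical names, and plain substring matching stands in for whatever full-text search the platform actually uses.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scope(Enum):
    ALL = "all"                      # every content type on the site
    NGA_TANGA = "nga-tanga-reo-maori"  # only items in the collection
    THIS_ITEM = "this-item"          # only the item currently being viewed


@dataclass(frozen=True)
class Record:
    item_id: str
    content_type: str    # e.g. "newspaper", "book"
    text: str
    in_nga_tanga: bool   # hypothetical flag marking collection membership


def search(records: list[Record], query: str, scope: Scope,
           current_item: Optional[str] = None) -> list[str]:
    """Return ids of records matching `query` within the chosen scope."""
    needle = query.casefold()
    hits = []
    for r in records:
        if scope is Scope.NGA_TANGA and not r.in_nga_tanga:
            continue
        if scope is Scope.THIS_ITEM and r.item_id != current_item:
            continue
        if needle in r.text.casefold():
            hits.append(r.item_id)
    return hits
```

Treating collection membership as a flag on each record, rather than as a separate content type, is what lets one collection cut horizontally across newspapers, magazines and books.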
One content type was not included in this change, Letters and Diaries, as it’s not part of the Ngā Tānga Reo Māori collection. However, the NLNZ team are looking to explore whether people would like a search that covers all te reo Māori content, or whether to keep it to just content specifically included in the ‘Books in Māori’ bibliography (as it is currently).
As well as the new search functionality, the new design brought te reo back to the website, with the inclusion of a mihi (a welcome) on the home page, specially written for Papers Past, and the choice to have the instructions presented in te reo Māori when using the Ngā Tānga Reo Māori search.
Significance of this collection
The Ngā Tānga Reo Māori collection is hugely significant as it reflects the history of Aotearoa (New Zealand) in the 1800s, and records te reo Māori at a point in time when a totally oral language was first being translated into a written form. Because of this, particularly early on, the spelling of words often varies and different styles were used to indicate long and short vowel sounds. In addition, the writers of these works tried out typographic experiments to indicate sounds such as ‘ng.’ The collection also offers important insights into the mita (dialects) in use by different iwi (extended kinship group, descended from a common ancestor and associated with a distinct territory) at the time.
Future plans for the collection
Since launching the Ngā Tānga Reo Māori collection at the end of 2021, the NLNZ team has received a range of positive feedback from key stakeholders as well as members of the public, including praise for the interface design. They recently set up a Google Analytics segment for the collection and since launch, it’s had approximately 70,000 sessions and 1.2 million page views across all the interfaces.
Melanie from NLNZ is currently managing the delivery of the remainder of the original 1,500 items listed in ‘Books in Māori,’ a multi-year project that DL Consulting is assisting with. One of the recent pieces of work DL Consulting has completed is to convert nine books that were delivered to NLNZ as PDFs into METS/ALTO, so that NLNZ can evaluate this as a method for content delivery in the future.