How to upload data to Veridian via an S3 bucket

Due to the nature of our work, clients often need to transfer large numbers of files to our team, whether source images to be digitised or data to be ingested into a Veridian collection, such as METS/ALTO files or born-digital PDFs.

While there are several ways that clients like to transfer data to us, including Google Drive, Dropbox or other file transfer tools, we’ve found the best way to transfer data is via S3. S3 is a secure object storage service offered by Amazon Web Services (AWS), and an S3 bucket is a cloud storage space within S3.

Here we explain how to upload data to an S3 bucket using DragonDisk (an S3-compatible client) on Windows, with the following sample information:

S3 Bucket: s3://upload.4225.dlconsulting.com
Remote Path: /upload.4225.dlconsulting.com
Username/Access Key ID: AccessKeyId (will be provided to you)
Pass Phrase/Access Key Secret: SecretAccessKey (will be provided to you)

Note that two upload methods are described below: copy and synchronisation. While both work well, synchronisation has some important advantages:

- When uploading large amounts of data, if the connection is interrupted there is no need to start from scratch. Instead, synchronisation will resume the upload where it was interrupted, potentially saving a significant amount of time.
- Synchronisation also provides the option of excluding certain files from the upload, e.g. if large archival TIFFs need to be left out.

Installing DragonDisk

Download the DragonDisk installer appropriate for the operating system on your computer, then run it to install DragonDisk.

http://www.s3-client.com/download-s3-compatible-cloud-client.html

Setting up an account in DragonDisk

Launch DragonDisk and from the 'File' menu, choose 'Accounts...'

Screenshot of DragonDisk 'File' menu, 'Accounts...' option highlighted.

From the 'Accounts' dialog, click on 'New' and enter the credentials:

Account Name: Veridian-Sample-S3bucket (Any name that makes sense to you).

Access Key: Enter the AccessKeyID provided to you.

Secret Key: Enter the SecretAccessKey provided to you.

Screenshot showing 'Accounts' dialog and 'Account' dialog for creating an account.

Click 'OK'.

You should see the new account entry in the 'Accounts' dialog with a green light on the left-hand side. If this is the case, click 'Close' to return to the main DragonDisk window.

Screenshot of new account entry in the 'Accounts' dialog with a green light on the left-hand side.

Method 1: Copying data to the S3 bucket

Setting up the remote path: From the main DragonDisk interface, click on the right-hand side 'Root' dropdown and select the correct S3 bucket (e.g. upload.4225.dlconsulting.com). The name of the bucket will be provided to you as part of your credentials.

Screenshot of right hand side root dropdown.

You should now see the contents of the bucket, which may only contain a *.details file.

Setting up the local path: You now need to know the path to the local folder (on your computer) where the data you intend to upload is stored. Use a file manager such as Windows File Explorer to determine the path to the correct local folder.

Screenshot of Windows File Explorer with sample local path highlighted in the address bar.

From the main DragonDisk interface, click on the left-hand side 'Root' dropdown and select the correct local drive (on your computer), then choose the folder containing the data to upload.

Select correct local drive: e.g. D:/

Screenshot of DragonDisk left pane, selecting local drive.

Select correct local folder: e.g. D:/Upload

Screenshot of left pane of DragonDisk with sample local folder selected.

Copy the data to the S3 bucket: Once you reach this point, you should be able to simply drag the data folder over to the right pane and the uploading process should start. You should see the upload progress in DragonDisk's bottom pane.

Screenshot of DragonDisk, dragging the sample data folder from the left pane to the right pane.

Method 2: Synchronising data to the S3 bucket

When uploading large amounts of data, using the synchronisation feature can prevent the need to restart a large upload from the beginning if the connection between your local computer and S3 is interrupted, potentially saving a lot of time.

Set up a synchronisation job: First you will need to create a folder to sync to on the S3 bucket. Create a folder in the S3 bucket by:

Right clicking in the right pane and selecting 'Create folder' ...

Screenshot showing right clicking in the right pane and selecting 'Create folder'.

... then naming the folder e.g. 'Batch1-2022-10-06'

Screenshot showing newly created folder in the right pane.

From the 'Synchronization' menu, click 'Manage Sync Jobs'.

Screenshot showing 'Manage Sync Jobs...' selected from the Synchronization menu.

From the Manage Sync Jobs dialog box, click 'Add', then in the 'Synchronization job' dialog:

- Enter a name for the job: e.g. Batch 1 upload

- Select a source folder on your local computer. e.g. D:/Upload/Sample-Data/Batch1-2022-10-06

- Select the target folder you created: e.g. s3://upload.4225.dlconsulting.com/Batch1-2022-10-06/

- Under the 'Options' tab, ensure all options are unchecked.

- Optional: If you need to exclude certain files from being uploaded (e.g. archival TIFFs), you can accomplish that by adding an exclude filter (e.g. *.tif) via the 'Filters' tab -> 'Exclude' -> 'Add'.

- Click 'OK'.

- Click 'Close'.

Screenshot showing how to set up synchronization job.
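
The source folder, target folder and exclude filter chosen above together determine which files end up where in the bucket. The following Python sketch previews that mapping before a job is run. It is an illustration only, not DragonDisk's own logic: the function name `preview_sync` is hypothetical, the contents of the source folder are assumed to be placed directly inside the target folder, and the filter is assumed to match on the file name alone, ignoring case. DragonDisk's actual matching rules may differ.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def preview_sync(relative_paths, target_uri, exclude_patterns=()):
    """Map source-relative paths to object keys, skipping excluded files.

    relative_paths: file paths relative to the sync source folder, using "/".
    target_uri: e.g. "s3://bucket-name/folder/".
    exclude_patterns: glob patterns such as "*.tif" (assumed here to be
    matched against the file name only, ignoring case).
    """
    parts = urlsplit(target_uri)
    bucket = parts.netloc
    prefix = parts.path.strip("/")

    planned = {}
    for rel in relative_paths:
        name = PurePosixPath(rel).name.lower()
        if any(fnmatchcase(name, pat.lower()) for pat in exclude_patterns):
            continue  # excluded by a filter, so it would not be uploaded
        planned[rel] = f"{prefix}/{rel}" if prefix else rel
    return bucket, planned


# Sample file names are made up for illustration.
bucket, planned = preview_sync(
    ["issue-001/0001.tif", "issue-001/0001.jp2", "issue-001/mets.xml"],
    "s3://upload.4225.dlconsulting.com/Batch1-2022-10-06/",
    exclude_patterns=["*.tif"],
)
print(bucket)
for local, key in planned.items():
    print(local, "->", key)
```

Under these assumptions, a pattern of `*.tif` would not match files ending in `.tiff`, so a second pattern would be needed if both extensions are in use.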

To start the sync job, open the 'Synchronization' menu and click the job you created, e.g. 'Batch 1 upload'.

Screenshot showing 'Batch 1 upload' selected from Synchronization menu.

A confirmation dialog will appear. Review the source and target paths and if they are correct, click 'Yes'.

Screenshot showing 'Sync folder' confirmation dialog box.

Your synchronisation job will now run, uploading the data you specified.

You can view progress in the bottom pane.

Screenshot showing synchronization progress in the bottom pane.

If the internet connection between your computer and S3 is interrupted during this process, you can re-run the sync job, which will upload only the data that hasn't already been uploaded, potentially saving a lot of time.
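
The time saving comes from skipping whatever has already reached the bucket. A minimal Python sketch of that idea follows. It is not DragonDisk's implementation: `files_still_to_upload` is a hypothetical helper, the sample keys and sizes are made up, and comparing by key and size is an assumed rule chosen for illustration (a real client may also compare timestamps or checksums).

```python
def files_still_to_upload(local_files, remote_objects):
    """Return the keys that a re-run would still need to upload.

    local_files: mapping of object key -> local file size in bytes.
    remote_objects: mapping of object key -> size already in the bucket.
    A file is skipped only if the same key exists remotely with the same
    size (an assumed comparison rule, for illustration).
    """
    return sorted(
        key
        for key, size in local_files.items()
        if remote_objects.get(key) != size
    )


local = {"Batch1/0001.jp2": 5_000, "Batch1/0002.jp2": 7_500, "Batch1/mets.xml": 900}
# Bucket contents after an interrupted run: one file matches, one has a
# different size, and one is missing entirely.
remote = {"Batch1/0001.jp2": 5_000, "Batch1/0002.jp2": 1_024}

print(files_still_to_upload(local, remote))
```

Only the two files that are missing or differ are listed, so the file that already matches is not sent again.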

Once you are happy that your files have been copied or synchronised, email your DL Consulting contact to let us know the transfer is complete, and we will continue processing the data at our end.
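
One way to gain that confidence is to count the files and total bytes in the local folder and compare those figures with the contents of the bucket folder. The sketch below uses only the Python standard library; the function name `summarise_folder` is illustrative, and the figures will not match if an exclude filter was used for the upload.

```python
import os


def summarise_folder(folder):
    """Return (file_count, total_bytes) for all files under folder."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return file_count, total_bytes
```

For the sample job above, this could be called as `summarise_folder("D:/Upload/Sample-Data/Batch1-2022-10-06")`.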