
Anatomy of the Transfer Package

DART takes the folders and files you upload to it and packages them as a bag, a container that adheres to the BagIt File Packaging Format. This page describes how the package is structured.

BagIt bags

The Transfer Package is a BagIt bag packaged as a single .tar archive file. Bags conform to BagIt's "set of hierarchical file layout conventions for storage and transfer of arbitrary digital content" (from the BagIt specification Abstract).

The Transfer Package contains a copy of the target folder(s).

  • The original target folder is unaffected and remains in place.
  • The name of the Transfer Package is based on the value you supplied in the Package Name field in DART.

Extract the .tar file to see how the bag is structured.

  • It comprises four "tag" files, plus a folder that contains the actual transfer files. Each is described below.
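If you would rather inspect the package without extracting it, the contents of a .tar file can be listed with a few lines of Python. This is a minimal sketch using only the standard library; the function name `list_bag_entries` is made up for illustration and is not part of DART.

```python
import tarfile


def list_bag_entries(tar_path):
    """Return the sorted names of every file and folder inside a .tar file."""
    # Mode "r" lets tarfile detect on its own whether the archive is compressed.
    with tarfile.open(tar_path, "r") as tar:
        return sorted(tar.getnames())
```

Calling `list_bag_entries("my-transfer.tar")` (a hypothetical file name) should return the tag files and everything under the data folder, which can be compared against the layout described on this page.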

bag-info.txt

  • Captures data supplied by the user on the DART interface (see the DART data entry guidelines page).
  • Includes some auto-generated metadata, e.g. Package-Time.
  • Field names follow the conventions of the BagIt specification.

bagit.txt

Identifies the package as a bag and states both the version of the BagIt specification used and the character encoding used for tag files.

  • SFU's DART Transfer Profiles use 1.0 as the default BagIt-Version value and UTF-8 for character encoding.
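With those defaults, bagit.txt holds two "Label: value" lines. The sketch below shows what the file would look like under that assumption and one simple way to read it; `parse_bagit_txt` is an illustrative helper, not a DART function.

```python
def parse_bagit_txt(text):
    """Turn the 'Label: value' lines of bagit.txt into a dictionary."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        label, _, value = line.partition(":")
        fields[label.strip()] = value.strip()
    return fields


# Expected file content when BagIt-Version is 1.0 and the encoding is UTF-8.
sample = "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n"
print(parse_bagit_txt(sample))
# {'BagIt-Version': '1.0', 'Tag-File-Character-Encoding': 'UTF-8'}
```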

data folder

  • Contains the actual transfer files in their original directory structure.

manifest-sha256.txt

Lists all the files in the data folder, each with a checksum generated by the SHA-256 algorithm (see below for more on checksums).
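Under the BagIt specification, each manifest line gives a checksum, then whitespace, then the file's path relative to the top of the bag. The following sketch reads lines of that shape into a dictionary. The sample paths and the shortened checksum values are invented for illustration, and the helper name `parse_manifest` is an assumption.

```python
def parse_manifest(text):
    """Map each file path in a manifest to its recorded checksum."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Split only once so that spaces inside a file name are preserved.
        checksum, path = line.split(None, 1)
        entries[path] = checksum.lower()
    return entries


# Invented sample; real SHA-256 values are 64 hexadecimal characters long.
sample = "ab12  data/reports/annual report.pdf\nCD34  data/minutes.docx\n"
print(parse_manifest(sample))
# {'data/reports/annual report.pdf': 'ab12', 'data/minutes.docx': 'cd34'}
```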

tagmanifest-sha256.txt

Lists all the tag files with their checksums.

Checksums

DART creates a checksum for each file included in your transfer and records the values in the bag's manifest file. A checksum:

  • Is an alpha-numeric value calculated by an algorithm applied to the file's underlying bitstream (the string of 0s and 1s).
  • Functions as a kind of digital fingerprint: any change to the bitstream will result in a completely different value when the same algorithm is applied.
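The fingerprint behaviour can be seen with Python's standard hashlib module. This is a sketch for illustration only and does not describe how DART itself computes checksums; the function name and the sample byte strings are made up.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Two byte strings that differ by a single character.
original = hashlib.sha256(b"annual report 2023").hexdigest()
altered = hashlib.sha256(b"annual report 2024").hexdigest()

print(len(original))        # 64 (hexadecimal characters)
print(original == altered)  # False
```

Reading in chunks keeps memory use low for large files; the result is the same as hashing the whole file at once.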

Following deposit, an archivist runs their own DART client to compare the files' pre-transfer checksums (stored in the manifest file) with checksums generated post-transfer.

  • If the values are different, something happened to the data (corruption or loss) during transmission.
  • If a checksum validation fails, the Archives will contact the producer and ask them to re-send the transfer.
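The comparison described above can be sketched as follows for a bag that has already been extracted to a folder. This is a simplified illustration, not DART's actual validation routine: it assumes plain "checksum path" manifest lines, does not handle percent-encoded file names, and does not look for extra files missing from the manifest. The function name `find_checksum_failures` is hypothetical.

```python
import hashlib
from pathlib import Path


def find_checksum_failures(bag_dir, manifest_name="manifest-sha256.txt"):
    """Return the manifest paths whose files are missing or have changed."""
    bag_dir = Path(bag_dir)
    manifest_text = (bag_dir / manifest_name).read_text(encoding="utf-8")
    failures = []
    for line in manifest_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        expected, rel_path = line.split(None, 1)
        target = bag_dir / rel_path
        if not target.is_file():
            failures.append(rel_path)  # file lost in transit
            continue
        # Reads the whole file into memory; fine for a sketch.
        actual = hashlib.sha256(target.read_bytes()).hexdigest()
        if actual != expected.lower():
            failures.append(rel_path)  # content no longer matches
    return failures
```

An empty list means every file still matches its pre-transfer checksum. Passing `manifest_name="tagmanifest-sha256.txt"` applies the same check to the tag files.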

Checksums also guard against the Archives itself inadvertently changing any data while inspecting the files during the validation phase.

Last updated: June 12, 2024