Content Preparation for Preservation

Before you begin, use this tool to inventory and assess your content for preservation:

The Texas Digital Library does not require any particular content preparation or packaging before content is ingested into DuraCloud™ @TDL or any of its storage options. However, members should consider the following as they prepare to use the service.

  • Think about documentation that describes the content (submission documentation) and keep a record of what you have ingested (i.e. a manifest).
    • Think about and include information that you (or others) will need to know in order to find any given resource in DuraCloud™ @TDL down the road.
    • Put a copy of the manifest or other submission documentation with the content in DuraCloud™ @TDL and keep a copy outside the system.
  • Think about metadata.
    • Know about the technical metadata generated by DuraCloud™ @TDL upon ingestion of files.
    • For other technical metadata, many options exist for extracting such information from files, including FITS and MediaInfo. If you are working with content currently in DSpace, an export of collection metadata from DSpace will contain some technical metadata for the files in the collection.
  • Think about packaging content for upload to Digital Preservation Storage.
    • Members may wish to simply upload “loose” items to DuraCloud™ @TDL using either the DurAdmin user interface or DuraSync, and rely on manifests and unique filenames to identify resources within DuraCloud Spaces.
    • Other members may wish to package content into groups of files using any number of methods, including those listed below. DuraCloud will work with any of these methods.
      • ZIP files: .ZIP is a package file format that supports lossless data compression. A .ZIP file may contain one or more files or folders, which may be compressed.
      • TAR files: .TAR is an archive file format often used to collect many files into one larger file while preserving file system information such as dates and directory structures.
      • BagIt: BagIt is a packaging format that places files into a specific directory structure, called a bag, for transfer and digital preservation. A bag can hold an arbitrary group of digital files of any type (metadata, media, text, etc.); the source files do not have to follow a specific file/folder structure. “Bagger” is a desktop software tool developed at the Library of Congress that creates the directory structure internal to each bag and generates a manifest of what is in the bag.
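
As an illustration of the manifest and packaging suggestions above, the following is a minimal Python sketch, not a TDL or DuraCloud tool. It walks a content folder, writes a manifest with one SHA-256 checksum and relative path per file, and then collects the folder into a single .TAR file. The function names and the manifest filename are hypothetical, and the checksum-then-path line layout is an assumption chosen for readability; use whatever manifest format suits your institution.

```python
import hashlib
import tarfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(content_dir: Path, manifest_path: Path) -> list[str]:
    """Write one '<checksum>  <relative path>' line per file; return the lines.

    The manifest file itself is skipped, so it can live inside content_dir
    and the function can be re-run without the manifest listing itself.
    """
    lines = []
    for path in sorted(content_dir.rglob("*")):
        if path.is_file() and path.resolve() != manifest_path.resolve():
            relative = path.relative_to(content_dir).as_posix()
            lines.append(f"{sha256_of(path)}  {relative}")
    manifest_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


def package_tar(content_dir: Path, tar_path: Path) -> None:
    """Collect content_dir (including any manifest in it) into an uncompressed .tar."""
    with tarfile.open(tar_path, "w") as archive:
        # arcname keeps only the folder name, not the full local path, in the archive
        archive.add(content_dir, arcname=content_dir.name)
```

For a hypothetical folder named my_collection, calling write_manifest(Path("my_collection"), Path("my_collection/manifest-sha256.txt")) and then package_tar(Path("my_collection"), Path("my_collection.tar")) would produce a package that carries its own manifest. A second copy of the manifest file can then be kept outside the system, as recommended above.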

Questions about setting up and using DuraCloud™ @TDL? Contact the TDL Helpdesk.

DuraCloud™ is open source software developed by DuraSpace.