Content Preparation for Preservation

The Texas Digital Library has not instituted requirements for content preparation or packaging before ingestion of that content into DuraCloud™ @TDL. However, members should consider the following as they prepare to use the service.

1) Think about documentation, and keep a record of what you have ingested (i.e. a manifest).

a) Think about and include information that you (or others) will need to know in order to find any given resource in DuraCloud™ @TDL down the road.

b) Put a copy of the manifest with the content in DuraCloud™ @TDL and keep a copy outside the system.

c) There are many options for creating such a manifest; one common approach is simply to build it manually in Excel.
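For members who would rather generate the manifest with a script than by hand, the following is a minimal sketch in Python. It is not a TDL or DuraCloud tool: the function names `sha256_of` and `build_manifest`, the CSV column names, and the choice of SHA-256 as the checksum are all assumptions made for illustration. The resulting CSV opens directly in Excel.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(content_dir, manifest_path):
    """Write one CSV row per file under content_dir; return the row count.

    The column names are illustrative, not a required schema.
    """
    content_dir = Path(content_dir)
    manifest_path = Path(manifest_path)
    rows = []
    for path in sorted(content_dir.rglob("*")):
        # Skip directories and the manifest itself if it lives in content_dir.
        if path.is_file() and path.resolve() != manifest_path.resolve():
            rows.append({
                "relative_path": path.relative_to(content_dir).as_posix(),
                "size_bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    with open(manifest_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=["relative_path", "size_bytes", "sha256"]
        )
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling `build_manifest("my_collection", "my_collection/manifest.csv")` (hypothetical paths) places the manifest alongside the content, and a second copy can then be saved outside the system as suggested above.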

2) Think about metadata.

a) Know about the technical metadata generated by DuraCloud™ @TDL upon ingestion of files.

b) For other technical metadata, many options exist for extracting such information from files, including FITS and MediaInfo. If you are working with content currently in DSpace, an export of collection metadata from DSpace will contain some technical metadata for the files in the collection.
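As a rough illustration of what "technical metadata" can mean at its simplest, the sketch below collects a few basic properties of a file using only the Python standard library. It is a stand-in, not a replacement for FITS or MediaInfo, which identify formats by inspecting file contents and report far more detail. The function name `basic_technical_metadata` and the dictionary keys are hypothetical, and the MIME type here is only a guess based on the file extension.

```python
import mimetypes
from datetime import datetime, timezone
from pathlib import Path


def basic_technical_metadata(path):
    """Return a small dict of file properties (illustrative keys only)."""
    path = Path(path)
    info = path.stat()
    # guess_type looks only at the file name, never at the file contents.
    mime_type, _encoding = mimetypes.guess_type(path.name)
    return {
        "filename": path.name,
        "size_bytes": info.st_size,
        "mime_type_guess": mime_type or "application/octet-stream",
        "modified_utc": datetime.fromtimestamp(
            info.st_mtime, tz=timezone.utc
        ).isoformat(),
    }
```

Values like these could be added as extra columns in a manifest so that they travel with the content.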

3) Think about packaging content for upload to DuraCloud.

Members may wish to simply upload “loose” items to DuraCloud™ @TDL using either the DurAdmin user interface or DuraSync, and rely on manifests and unique filenames to identify resources within DuraCloud Spaces.

Other members may wish to package content into groups of files using any number of methods, including those listed below. DuraCloud will work with any of these methods.

  • ZIP files: .ZIP is an archive file format that supports lossless data compression. A .ZIP file may contain one or more files or folders that may have been compressed. 
  • TAR files: .TAR is an archive file format often used to collect many files into one larger file, while preserving file system information such as dates and directory structures.
  • BagIt: BagIt is a packaging format for transfer and digital preservation, and “Bagger” is a desktop software tool developed at the Library of Congress that packages files into the BagIt directory structure, called a bag. A bag can hold digital files of any type (metadata, media, text, etc.), and the grouping can be arbitrary: the source files do not have to follow a specific file/folder structure. BagIt creates a directory structure internal to each bag and generates a manifest of what is in the bag. To learn more about BagIt, please download this document.
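To make the bag structure described above more concrete, here is a minimal sketch in Python of the layout a bag has on disk: a `data/` directory holding the content, a `bagit.txt` declaration, and a payload manifest listing a checksum for each file. This is an illustration only, not how Bagger itself is implemented. The function name `make_minimal_bag` is hypothetical, SHA-256 is an assumed checksum choice, and the sketch omits optional tag files such as `bag-info.txt` and any escaping of unusual characters in file names. For real transfers, an established tool such as Bagger is the safer choice.

```python
import hashlib
import shutil
from pathlib import Path


def make_minimal_bag(source_dir, bag_dir):
    """Copy source_dir into bag_dir/data and write the two basic tag files.

    Assumes bag_dir is outside source_dir and has no data/ directory yet.
    Returns the manifest lines that were written.
    """
    source_dir, bag_dir = Path(source_dir), Path(bag_dir)
    data_dir = bag_dir / "data"
    shutil.copytree(source_dir, data_dir)  # creates bag_dir if needed

    # Bag declaration: marks the directory as a bag.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Payload manifest: one "<checksum>  <path>" line per file under data/.
    lines = []
    for path in sorted(data_dir.rglob("*")):
        if path.is_file():
            # read_bytes loads the whole file; fine for a sketch, but large
            # files would be better hashed in chunks.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}")
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(lines) + "\n", encoding="utf-8"
    )
    return lines
```

The resulting bag directory can be uploaded as is, or first bundled into a single .ZIP or .TAR file using one of the other methods in the list.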

Questions about setting up and using DuraCloud™ @TDL? Contact the TDL Helpdesk.

DuraCloud™ is open source software developed by DuraSpace.