DuraCloud™ @TDL currently offers members four storage options, which can be used in any combination: two in the Amazon cloud (S3 and Glacier) and two at the Texas Advanced Computing Center (TACC), namely TDL’s Chronopolis and Digital Preservation Network (DPN) nodes. Copies of content ingested into a DuraCloud™ @TDL space can be stored in any of these four locations.
Cultural Heritage Options
Amazon S3 (Simple Storage Service) is secure, durable, highly scalable object storage in the Amazon cloud. Amazon S3 is “high-availability” storage, meaning that content stored here can be retrieved with relative ease.
The Amazon S3 service redundantly stores data in multiple facilities and on multiple devices within each facility in the same region. To increase durability, Amazon S3 synchronously stores your data across multiple facilities before returning SUCCESS. In addition, Amazon S3 calculates checksums on all network traffic to detect corruption of data packets when storing or retrieving data. Amazon S3 performs regular, systematic data integrity checks and is built to be automatically self-healing.
Amazon Glacier is a low-cost cloud archive storage service that provides secure and durable storage for data archiving and online backup. In order to keep costs low, Amazon Glacier is optimized for data that is infrequently accessed and for which retrieval times of several hours are suitable. The Texas Digital Library utilizes AWS Glacier spaces located within the continental United States.
The Glacier service redundantly stores data in multiple facilities and on multiple devices within each facility in the same region. To increase durability, Amazon Glacier synchronously stores your data across multiple facilities before returning SUCCESS on uploading archives. Glacier performs regular, systematic data integrity checks and is built to be automatically self-healing.
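Both services are described above as relying on checksums and regular integrity checks. As a rough illustration of the same idea on the member's side, the sketch below computes a checksum for a local file and compares it with a value recorded earlier. The function names, the block size, and the choice of SHA-256 are assumptions made for this example; they are not taken from Amazon's or DuraCloud's implementation.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", block_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size blocks."""
    digest = hashlib.new(algorithm)
    with Path(path).open("rb") as handle:
        # Read block by block so large files never sit fully in memory.
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def is_intact(path, expected_hex, algorithm="sha256"):
    """Compare a freshly computed digest with one recorded earlier."""
    return file_checksum(path, algorithm) == expected_hex.lower()
```

Recording the digest before ingestion and calling `is_intact` on a retrieved copy is one simple way to confirm that the two are bit-for-bit identical.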
Ingestion Modes and Resources
Members may manage ingestion and retrieval of items through the DurAdmin administrative interface or through the DuraCloud Sync tool (DuraSync).
- DurAdmin: DuraCloud™ supports uploading files through file selection or drag-and-drop in the web-based DuraCloud™ administrative interface. However, this method requires you to initiate the upload for each file or set of files every time you want to update them. The web-based interface also does not allow you to upload whole directories at once.
- DuraCloud Sync: This application allows you to continuously copy files from any number of local folders to a DuraCloud™ space. As you add, update, and delete files locally, these changes are automatically propagated to the cloud. The tool runs in two modes: a graphical (GUI) mode and a command-line mode.
Instructions for setting up and using these tools are available HERE.
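To make the idea of propagating local additions, updates, and deletions more concrete, here is a minimal sketch of how such changes could be detected by comparing two snapshots of a folder. It is an illustration only and does not reflect how the Sync Tool is actually implemented; `snapshot` and `diff_snapshots` are hypothetical helper names, and using file size plus modification time as the change signal is an assumption.

```python
import os


def snapshot(root):
    """Map each file's path (relative to root) to its (size, mtime) pair."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            info = os.stat(full)
            state[os.path.relpath(full, root)] = (info.st_size, info.st_mtime_ns)
    return state


def diff_snapshots(old, new):
    """Return the sets of added, updated, and deleted relative paths."""
    added = set(new) - set(old)
    deleted = set(old) - set(new)
    # A file present in both snapshots counts as updated if its
    # size or modification time differs.
    updated = {path for path in set(old) & set(new) if old[path] != new[path]}
    return added, updated, deleted
```

A synchronization loop would take a snapshot, wait, take another, and then upload the added and updated files and remove the deleted ones from the remote space.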
DuraSync Command-line mode
Command-line mode is useful for users working in a server environment, for those who want to run the Sync Tool from scripts, or for those who simply prefer a command-line interface.
Special Considerations for Ingestion of Large Files: Chunker and Stitcher Tools in DuraCloud
Large files may be broken up into “chunks” as they are ingested into DuraCloud™ @TDL. When these files are retrieved, the chunks are automatically “stitched” back together. Checksums are used to verify that the reassembled files retain their bit integrity.
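The chunk-and-stitch idea can be sketched in a few lines. The example below splits a byte string into fixed-size chunks, records a checksum for each one, and verifies both the individual chunks and the reassembled whole when stitching. It is a simplified illustration under stated assumptions, not the DuraCloud Chunker or Stitcher: the function names, the in-memory data, and the use of SHA-256 are all choices made for this example.

```python
import hashlib


def chunk_bytes(data, chunk_size):
    """Split data into consecutive chunks, recording a checksum for each."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    for start in range(0, len(data), chunk_size):
        piece = data[start:start + chunk_size]
        chunks.append((piece, hashlib.sha256(piece).hexdigest()))
    return chunks


def stitch_chunks(chunks, expected_hex):
    """Verify each chunk, rejoin them in order, and verify the whole."""
    for index, (piece, piece_hex) in enumerate(chunks):
        if hashlib.sha256(piece).hexdigest() != piece_hex:
            raise ValueError(f"chunk {index} failed its checksum")
    data = b"".join(piece for piece, _ in chunks)
    if hashlib.sha256(data).hexdigest() != expected_hex:
        raise ValueError("stitched content failed its checksum")
    return data
```

Checking each chunk separately pinpoints which piece was damaged, while the final whole-file check confirms that nothing was lost or reordered during reassembly.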
DuraCloud Retrieval Tool (command-line tool)
The Retrieval Tool is a utility that transfers (or “retrieves”) digital content from DuraCloud™ @TDL to your local file system.
Instructions for setting up and using the DuraCloud™ @TDL ingestion and retrieval tools mentioned above are available HERE.