Providing High Reliability in a Minimum Redundancy Archival Storage System

  • Authors:
  • Deepavali Bhagwat; Kristal Pollack; Darrell D. E. Long; Thomas Schwarz; Ethan L. Miller; Jehan-Francois Paris

  • Affiliations:
  • University of California, Santa Cruz, USA; University of California, Santa Cruz, USA; University of California, Santa Cruz, USA; University of California, Santa Cruz, USA; University of California, Santa Cruz, USA; University of Houston, USA

  • Venue:
  • MASCOTS '06 Proceedings of the 14th IEEE International Symposium on Modeling, Analysis, and Simulation
  • Year:
  • 2006

Abstract

Inter-file compression techniques store files as sets of references to data objects or chunks that can be shared among many files. While these techniques can achieve much better compression ratios than conventional intra-file compression methods such as Lempel-Ziv compression, they also reduce the reliability of the storage system because the loss of a few critical chunks can lead to the loss of many files. We show how to eliminate this problem by choosing for each chunk a replication level that is a function of the amount of data that would be lost if that chunk were lost. Experiments using actual archival data show that our technique can achieve significantly higher robustness than a conventional approach combining data mirroring and intra-file compression while requiring about half the storage space.
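
The idea of tying a chunk's replication level to the amount of data that depends on it can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names `dependent_bytes` and `replication_level`, the logarithmic policy, and the `base` and `max_copies` parameters are all hypothetical. The abstract only states that the level is a function of the data that would be lost.

```python
import math
from collections import defaultdict


def dependent_bytes(files, chunk_sizes):
    """Map each chunk id to the total size of all files that reference it.

    files: dict of file name -> list of chunk ids making up that file.
    chunk_sizes: dict of chunk id -> size in bytes.
    Losing a chunk is assumed to make every file that references it unreadable.
    """
    loss = defaultdict(int)
    for chunks in files.values():
        file_size = sum(chunk_sizes[c] for c in chunks)
        for c in set(chunks):  # count each file once per chunk
            loss[c] += file_size
    return dict(loss)


def replication_level(loss_bytes, base=1024, max_copies=4):
    """Hypothetical policy, not the paper's: one copy, plus one extra copy
    for each doubling of the potential loss above `base`, capped at
    `max_copies`."""
    if loss_bytes <= base:
        return 1
    return min(max_copies, 1 + int(math.log2(loss_bytes / base)))


if __name__ == "__main__":
    # Toy archive: chunk c1 is shared by all three files.
    files = {"a": ["c1", "c2"], "b": ["c1", "c3"], "c": ["c1"]}
    chunk_sizes = {"c1": 4096, "c2": 1024, "c3": 2048}
    for chunk, loss in sorted(dependent_bytes(files, chunk_sizes).items()):
        print(chunk, loss, replication_level(loss))
```

In this toy archive the shared chunk `c1` receives more copies than the chunks referenced by a single file, which is the qualitative behavior the abstract describes. A real system would tune the policy against measured data and a storage budget.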