Tolerating multiple failures in RAID architectures with optimal storage and uniform declustering

  • Authors:
  • Guillermo A. Alvarez;Walter A. Burkhard;Flaviu Cristian

  • Affiliations:
  • Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA;Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA;Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA

  • Venue:
  • Proceedings of the 24th annual international symposium on Computer architecture
  • Year:
  • 1997

Abstract

We present DATUM, a novel method for tolerating multiple disk failures in disk arrays. DATUM is the first known method that can mask any given number of failures, requires an optimal amount of redundant storage space, and spreads reconstruction accesses uniformly over disks in the presence of failures without needing large layout tables in controller memory. Our approach is based on information dispersal, a coding technique that admits an efficient hardware implementation. As the method does not restrict the configuration parameters of the disk array, many existing RAID organizations are particular cases of DATUM. A detailed performance comparison with two other approaches shows that DATUM's response times are similar to those of the best competitor when two or fewer disks fail, and that the performance degrades gracefully when more than two disks fail.
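
To make the idea of information dispersal concrete, the following is a minimal sketch of a generic dispersal code, not DATUM's actual code or disk layout. It assumes a non-systematic Vandermonde code over the prime field GF(257); the names `encode`, `decode`, and `P` are hypothetical and chosen for illustration. With m data symbols dispersed into n symbols, one per disk, any m survivors suffice, so n - m redundant disks mask n - m failures, which is the optimal-redundancy property the abstract refers to. A hardware-oriented implementation would more likely work in GF(2^8) so that every symbol fits in a byte.

```python
# Sketch of information dispersal over the prime field GF(257).
# Assumption: a generic Vandermonde code, not DATUM's actual code or layout.
from itertools import combinations

P = 257  # prime modulus; every byte value 0..255 is a field element


def encode(data, n):
    """Disperse m data symbols into n symbols, one per hypothetical disk."""
    # Symbol i is the data polynomial evaluated at x = i + 1 (mod P).
    return [sum(d * pow(x, j, P) for j, d in enumerate(data)) % P
            for x in range(1, n + 1)]


def decode(survivors, m):
    """Recover the m data symbols from any m (disk_index, symbol) pairs."""
    pts = survivors[:m]
    # Build the augmented Vandermonde system [V | y] and solve it mod P
    # with Gauss-Jordan elimination.
    rows = [[pow(i + 1, j, P) for j in range(m)] + [y] for i, y in pts]
    for col in range(m):
        piv = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        inv = pow(rows[col][col], P - 2, P)  # inverse via Fermat's little theorem
        rows[col] = [v * inv % P for v in rows[col]]
        for r in range(m):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % P for a, b in zip(rows[r], rows[col])]
    return [rows[j][m] for j in range(m)]


# Usage: 5 data symbols on 8 disks; any 3 disks may fail.
data = [68, 65, 84, 85, 77]
shares = encode(data, 8)
for keep in combinations(range(8), 5):
    assert decode([(i, shares[i]) for i in keep], 5) == data
```

The sketch covers only the coding step. It says nothing about how stripes are placed on disks, which is where the uniform declustering described in the abstract comes in.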