Selecting RAID levels for disk arrays

  • Authors:
  • Eric Anderson;Ram Swaminathan;Alistair Veitch;Guillermo A. Alvarez;John Wilkes

  • Affiliations:
  • Storage and Content Distribution Department, Hewlett-Packard Laboratories, Palo Alto, CA (all authors)

  • Venue:
  • FAST'02 Proceedings of the 1st USENIX conference on File and storage technologies
  • Year:
  • 2002

Abstract

Disk arrays have a myriad of configuration parameters that interact in counter-intuitive ways, and those interactions can have significant impacts on cost, performance, and reliability. Even after values for these parameters have been chosen, there are exponentially many ways to map data onto the disk arrays' logical units. Meanwhile, the importance of correct choices is increasing: storage systems represent a growing fraction of total system cost, they need to respond more rapidly to changing needs, and there is less and less tolerance for mistakes. We believe that automatic design and configuration of storage systems is the only viable solution to these issues. To that end, we present a comparative study of a range of techniques for programmatically choosing the RAID levels to use in a disk array. Our simplest approaches are modeled on existing, manual rules of thumb: they "tag" data with a RAID level before determining the configuration of the array to which it is assigned. Our best approach simultaneously determines the RAID levels for the data, the array configuration, and the layout of data on that array. It operates as an optimization process with the twin goals of minimizing array cost while ensuring that storage workload performance requirements will be met. This approach produces robust solutions with an average cost/performance 14-17% better than the best results for the tagging schemes, and up to 150-200% better than their worst solutions. We believe that this is the first presentation and systematic analysis of a variety of novel, fully automatic RAID-level selection techniques.