Hierarchical RAID: Design, performance, reliability, and recovery

  • Authors:
  • Alexander Thomasian;Yujie Tang;Yang Hu

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2012

Abstract

Hierarchical RAID (HRAID) extends the RAID paradigm to mask the failure of whole Storage Nodes (SNs) or bricks, where each SN is a disk array with a certain RAID level. HRAID$k/\ell$ with $N$ SNs and $M$ disks per SN tolerates $k$ SN failures and $\ell$ disk failures per SN with Maximum Distance Separable (MDS) erasure codes, which introduce the minimum level of redundancy at each level. For $N=M$ there are $k$ internode and $\ell$ intranode check strips per SN, occupying the capacity of as many disks with storage redundancy $(k+\ell)/N$, but a higher storage redundancy is required for $M \neq N$. HRAID$k/\ell$ tolerates every pattern of up to $d_{\min}=(k+1)(\ell+1)-1$ disk failures, and some patterns of up to $d_{\max}=N\ell+Mk-k\ell$ disk failures can be tolerated. Three options for HRAID operation are: (I) Only intranode recovery. (II) Intranode and internode recovery, with on-demand reconstruction of blocks and rebuild. (III) Multistep internode recovery with no rebuild processing. The I/Os Per Second (IOPS) metric is used to assess the cost of fault-tolerance for HRAID$k/\ell$ against RAID($4+\ell$) and RAID0, for varying $k$ and $\ell$. The maximum IOPS is at its lowest in degraded mode, but even with fewer operational disks the normal mode IOPS may be exceeded after restriping. Asymptotic reliability analysis and simulation results show that HRAID$k/\ell$ with $\ell > k$ provides a higher reliability when SN failures are due to disk rather than controller failures. Monte Carlo simulation is used to quantify the effect on the Mean Time to Data Loss (MTTDL) of the recovery options, for varying $k$ and $\ell$ and as the SN controller failure rate is varied relative to the disk failure rate. The HRAID paradigm is justified by the fact that Option II attains a significantly higher MTTDL than Option I. Option III with no rebuild processing has an MTTDL exceeding that of Option II, but poorer performance.
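
The closed-form quantities quoted in the abstract can be checked with a short Python sketch. The three formulas are taken directly from the abstract; everything else is an assumption made for illustration. In particular, the function names and the threshold failure model in `tolerates` (an SN counts as lost once more than l of its disks have failed, and data survive while at most k SNs are lost) are a simplified reading of the abstract, not the authors' recovery procedure.

```python
def storage_redundancy(n, k, l):
    """Storage redundancy (k + l) / N, stated in the abstract for the N = M case."""
    return (k + l) / n


def d_min(k, l):
    """Largest number of disk failures tolerated regardless of their placement."""
    return (k + 1) * (l + 1) - 1


def d_max(n, m, k, l):
    """Largest number of disk failures tolerated under the most favorable placement."""
    return n * l + m * k - k * l


def tolerates(failed_per_sn, k, l):
    """Assumed threshold model (not from the paper): an SN is lost when more
    than l of its disks have failed; data survive while at most k SNs are lost."""
    lost_sns = sum(1 for failed in failed_per_sn if failed > l)
    return lost_sns <= k


def least_favorable_pattern(n, k, l):
    """d_min + 1 failures concentrated so that k + 1 SNs each lose l + 1 disks."""
    return [l + 1] * (k + 1) + [0] * (n - k - 1)


def most_favorable_pattern(n, m, k, l):
    """d_max failures: k SNs lose all M disks, the remaining SNs lose l disks each."""
    return [m] * k + [l] * (n - k)
```

For example, with N = M = 4 and k = l = 1, the storage redundancy is 0.5, `d_min(1, 1)` is 3 and `d_max(4, 4, 1, 1)` is 7. Under the assumed model, the most favorable pattern `[4, 1, 1, 1]` (7 failures) is tolerated, while the least favorable pattern `[2, 2, 0, 0]` (4 failures, one more than d_min) is not.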