Pergamum: replacing tape with energy efficient, reliable, disk-based archival storage
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
Disk scrubbing versus intra-disk redundancy for high-reliability raid storage systems
SIGMETRICS '08 Proceedings of the 2008 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Disk drive workload captured in logs collected during the field return incoming test
WASL'08 Proceedings of the First USENIX conference on Analysis of system logs
Disk Scrubbing Versus Intradisk Redundancy for RAID Storage Systems
ACM Transactions on Storage (TOS)
Beyond MTTDL: A Closed-Form RAID 6 Reliability Equation
ACM Transactions on Storage (TOS)
Today's data storage systems are increasingly adopting low-cost disk drives that have higher capacity but lower reliability, leading to more frequent rebuilds and to a higher risk of unrecoverable media errors. We propose a new XOR-based intra-disk redundancy scheme, called interleaved parity check (IPC), that enhances the reliability of RAID systems while incurring only negligible I/O performance degradation. The proposed scheme introduces an additional level of redundancy inside each disk, on top of the RAID redundancy across multiple disks. The RAID parity provides protection against disk failures, while the proposed scheme aims to protect against media-related unrecoverable errors.

We develop a new model that captures the effect of correlated unrecoverable sector errors and use it to analyze the proposed scheme as well as the traditional redundancy schemes based on Reed-Solomon (RS) codes and single-parity-check (SPC) codes. We derive closed-form expressions for the mean time to data loss (MTTDL) of RAID 5 and RAID 6 systems in the presence of unrecoverable errors and disk failures. We then combine these results for a comprehensive characterization of the reliability of RAID systems that incorporate the proposed IPC redundancy scheme. Our results show that, in the practical case of correlated errors, the proposed scheme provides the same reliability as the optimum, albeit more complex, RS coding scheme. Finally, the throughput performance of RAID systems that incorporate the intra-disk redundancy is evaluated by means of event-driven simulations. A detailed description of these contributions is given in [1].
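The idea behind an XOR-based interleaved parity can be illustrated with a minimal sketch. The abstract does not spell out the on-disk layout, so the arrangement below is an assumption: the sectors of a segment are split into `interleaves` groups by index modulo `interleaves`, and one parity sector per group holds the XOR of that group. The function names `ipc_parity` and `ipc_recover` are hypothetical and chosen only for this illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length sectors."""
    return bytes(x ^ y for x, y in zip(a, b))


def ipc_parity(sectors, interleaves):
    """Compute one parity sector per interleave.

    Assumed layout (not taken from the abstract): sector i belongs to
    interleave i % interleaves, and parity[j] is the XOR of all sectors
    in interleave j.
    """
    size = len(sectors[0])
    parity = [bytes(size) for _ in range(interleaves)]
    for i, sector in enumerate(sectors):
        j = i % interleaves
        parity[j] = xor_bytes(parity[j], sector)
    return parity


def ipc_recover(sectors, parity, interleaves):
    """Rebuild unreadable sectors, which are marked as None.

    Each interleave can repair at most one lost sector; a second loss in
    the same interleave is reported as unrecoverable at this level.
    """
    repaired = list(sectors)
    for j in range(interleaves):
        members = range(j, len(sectors), interleaves)
        lost = [i for i in members if sectors[i] is None]
        if len(lost) > 1:
            raise ValueError(f"interleave {j} lost {len(lost)} sectors")
        if lost:
            acc = parity[j]
            for i in members:
                if sectors[i] is not None:
                    acc = xor_bytes(acc, sectors[i])
            repaired[lost[0]] = acc
    return repaired


# Example: 8 data sectors of 4 bytes each, protected by 4 interleaves.
data = [bytes([i] * 4) for i in range(8)]
parity = ipc_parity(data, interleaves=4)

# A burst of three consecutive unreadable sectors (indices 2, 3, 4).
damaged = list(data)
for i in (2, 3, 4):
    damaged[i] = None

restored = ipc_recover(damaged, parity, interleaves=4)
```

Under this assumed layout, a burst of consecutive unreadable sectors no longer than the number of interleaves touches each interleave at most once, so every lost sector can be rebuilt from its own parity. This is one way to see why interleaving suits correlated (bursty) sector errors, although the analysis in [1] should be consulted for the scheme's actual parameters.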