We introduce a new closed-form equation for estimating the number of data-loss events for a redundant array of inexpensive disks in a RAID-6 configuration. The equation expresses operational failures, their restorations, latent (sector) defects, and disk media scrubbing by time-based distributions that can represent non-homogeneous Poisson processes. It uses two-parameter Weibull distributions, which allow the distributions to take on many different shapes, modeling increasing, decreasing, or constant occurrence rates. This article focuses on the statistical basis of the equation. It also presents time-based distributions of the four processes based on an extensive analysis of field data collected over several years from tens of thousands of commercially available systems with hundreds of thousands of disk drives. Our results for RAID-6 groups of size 16 indicate that the closed-form expression yields much more accurate results than the MTTDL reliability equation and matches computationally intensive Monte Carlo simulations.
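To illustrate how a two-parameter Weibull distribution can model increasing, decreasing, or constant occurrence rates, the following is a minimal sketch using the standard Weibull hazard function h(t) = (β/η)(t/η)^(β-1). It is not the closed-form RAID-6 equation from the article. The function names and the parameter values in the usage loop (a scale of 1,000 hours and shapes of 0.5, 1.0, and 2.0) are hypothetical and chosen only for illustration, not taken from the field data.

```python
import math


def weibull_hazard(t, shape, scale):
    """Instantaneous occurrence rate of a two-parameter Weibull at time t.

    h(t) = (shape / scale) * (t / scale) ** (shape - 1), for t > 0.
    """
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape, and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_cdf(t, shape, scale):
    """Probability that the event has occurred by time t."""
    if t < 0 or shape <= 0 or scale <= 0:
        raise ValueError("t must be non-negative; shape and scale positive")
    return 1.0 - math.exp(-((t / scale) ** shape))


def rate_trend(shape):
    """Classify the occurrence-rate trend implied by the shape parameter."""
    if shape < 1.0:
        return "decreasing"
    if shape > 1.0:
        return "increasing"
    return "constant"


if __name__ == "__main__":
    scale_hours = 1000.0  # hypothetical characteristic life, illustration only
    for shape in (0.5, 1.0, 2.0):  # hypothetical shape values
        early = weibull_hazard(100.0, shape, scale_hours)
        late = weibull_hazard(500.0, shape, scale_hours)
        print(shape, rate_trend(shape), early, late)
```

With a shape of exactly 1 the hazard is constant and the Weibull reduces to the exponential distribution, which is the constant-rate assumption underlying MTTDL-style calculations. Shapes below or above 1 give the decreasing or increasing rates that a constant-rate model cannot express.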