A case for redundant arrays of inexpensive disks (RAID)
SIGMOD '88 Proceedings of the 1988 ACM SIGMOD international conference on Management of data
RAID: high-performance, reliable secondary storage
ACM Computing Surveys (CSUR)
EVENODD: An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures
IEEE Transactions on Computers - Special issue on fault-tolerant computing
Disk Scrubbing in Large Archival Storage Systems
MASCOTS '04 Proceedings of the The IEEE Computer Society's 12th Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
Awarded Best Paper! -- Row-Diagonal Parity for Double Disk Failure Correction
FAST '04 Proceedings of the 3rd USENIX Conference on File and Storage Technologies
SIGMETRICS '06/Performance '06 Proceedings of the joint international conference on Measurement and modeling of computer systems
A fresh look at the reliability of long-term digital storage
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Enhanced Reliability Modeling of RAID Storage Systems
DSN '07 Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks
An analysis of latent sector errors in disk drives
Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Disk drive level workload characterization
ATEC '06 Proceedings of the annual conference on USENIX '06 Annual Technical Conference
Disk failures in the real world: what does an MTTF of 1,000,000 hours mean to you?
FAST '07 Proceedings of the 5th USENIX conference on File and Storage Technologies
Failure trends in a large disk drive population
FAST '07 Proceedings of the 5th USENIX conference on File and Storage Technologies
ACM Transactions on Storage (TOS)
Efficient management of idleness in storage systems
ACM Transactions on Storage (TOS)
Restrained utilization of idleness for transparent scheduling of background tasks
Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
Higher reliability redundant disk arrays: Organization, operation, and coding
ACM Transactions on Storage (TOS)
Understanding latent sector errors and how to protect against them
ACM Transactions on Storage (TOS)
A clean-slate look at disk scrubbing
FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies
Understanding latent sector errors and how to protect against them
FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies
On the impact of disk scrubbing on energy savings
HotPower'08 Proceedings of the 2008 conference on Power aware computing and systems
Online availability upgrades for parity-based RAIDs through supplementary parity augmentations
ACM Transactions on Storage (TOS)
Disk Scrubbing Versus Intradisk Redundancy for RAID Storage Systems
ACM Transactions on Storage (TOS)
Busy bee: how to use traffic information for better scheduling of background tasks
ICPE '12 Proceedings of the 3rd ACM/SPEC International Conference on Performance Engineering
Hi-index | 0.00 |
Two schemes proposed to cope with unrecoverable or latent media errors and enhance the reliability of RAID systems are examined. The first scheme is the established, widely used disk scrubbing scheme, which operates by periodically accessing disk drives to detect media-related unrecoverable errors. These errors are subsequently corrected by rebuilding the sectors affected. The second scheme is the recently proposed intradisk redundancy scheme which uses a further level of redundancy inside each disk, in addition to the RAID redundancy across multiple disks. Analytic results are obtained assuming Poisson arrivals of random I/O requests. Our results demonstrate that the reliability improvement due to disk scrubbing depends on the scrubbing frequency and the workload of the system, and may not reach the reliability level achieved by a simple IPC-based intra-disk redundancy scheme, which is insensitive to the workload. In fact, the IPC-based intra-disk redundancy scheme achieves essentially the same reliability as that of a system operating without unrecoverable sector errors. For heavy workloads, the reliability achieved by the scrubbing scheme can be orders of magnitude less than that of the intra-disk redundancy scheme.