Practical scrubbing: Getting to the bad sector at the right time

Authors:
George Amvrosiadis;Alina Oprea;Bianca Schroeder
Affiliations:
Dept. of Computer Science, University of Toronto, Canada;RSA Laboratories, Cambridge, MA, USA;Dept. of Computer Science, University of Toronto, Canada
Venue:
DSN '12 Proceedings of the 2012 42nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)
Year:
2012

Abstract

Latent sector errors (LSEs) are a common hard disk failure mode, where disk sectors become inaccessible while the rest of the disk remains unaffected. To protect against LSEs, commercial storage systems use scrubbers: background processes verifying disk data. The efficiency of different scrubbing algorithms in detecting LSEs has been studied in depth; however, no attempts have been made to evaluate or mitigate the impact of scrubbing on application performance. We provide the first known evaluation of the performance impact of different scrubbing policies in implementation, including guidelines on implementing a scrubber. To lessen this impact, we present an approach giving conclusive answers to the questions: when should scrubbing requests be issued, and at what size, to minimize impact and maximize scrubbing throughput for a given workload. Our approach achieves six times more throughput, and up to three orders of magnitude less slowdown than the default Linux I/O scheduler.
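The two tuning questions raised in the abstract (when to issue scrubbing requests, and at what size) can be made concrete with a minimal sketch. This is an illustration under stated assumptions, not the authors' scrubber: `scrub`, `request_size`, `is_idle`, and `idle_wait` are hypothetical names, the idleness check is a caller-supplied placeholder, and the code reads a regular file, whereas a real scrubber would work on the block device and would have to make sure requests reach the disk rather than a cache.

```python
import os
import time


def scrub(path, request_size=64 * 1024, is_idle=lambda: True, idle_wait=0.01):
    """Read `path` front to back in fixed-size requests.

    Returns (requests_issued, bad_offsets), where bad_offsets lists the
    starting offsets of requests that raised an I/O error.
    """
    requests_issued = 0
    bad_offsets = []
    size = os.path.getsize(path)
    # buffering=0 avoids Python-level buffering; the OS page cache may
    # still serve reads, so this is only a stand-in for real verification.
    with open(path, "rb", buffering=0) as f:
        offset = 0
        while offset < size:
            # "When": hold back while the (hypothetical) idleness check
            # reports foreground activity.
            while not is_idle():
                time.sleep(idle_wait)
            # "At what size": one request covers request_size bytes.
            try:
                f.seek(offset)
                f.read(request_size)
            except OSError:
                bad_offsets.append(offset)
            requests_issued += 1
            offset += request_size
    return requests_issued, bad_offsets
```

In this sketch, a larger `request_size` means fewer requests for the same amount of data, but each one occupies the disk for longer, while a stricter `is_idle` delays scrubbing in favor of foreground work. These are the two knobs the abstract says must be tuned for a given workload.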