Practical scrubbing: Getting to the bad sector at the right time

  • Authors: George Amvrosiadis; Alina Oprea; Bianca Schroeder

  • Affiliations: Dept. of Computer Science, University of Toronto, Canada; RSA Laboratories, Cambridge, MA, USA; Dept. of Computer Science, University of Toronto, Canada

  • Venue: DSN '12 Proceedings of the 2012 42nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)
  • Year: 2012

Abstract

Latent sector errors (LSEs) are a common hard disk failure mode, where disk sectors become inaccessible while the rest of the disk remains unaffected. To protect against LSEs, commercial storage systems use scrubbers: background processes verifying disk data. The efficiency of different scrubbing algorithms in detecting LSEs has been studied in depth; however, no attempts have been made to evaluate or mitigate the impact of scrubbing on application performance. We provide the first known evaluation of the performance impact of different scrubbing policies in implementation, including guidelines on implementing a scrubber. To lessen this impact, we present an approach giving conclusive answers to the questions: when should scrubbing requests be issued, and at what size, to minimize impact and maximize scrubbing throughput for a given workload. Our approach achieves six times more throughput, and up to three orders of magnitude less slowdown than the default Linux I/O scheduler.