SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Disk Scrubbing in Large Archival Storage Systems
MASCOTS '04 Proceedings of the The IEEE Computer Society's 12th Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems
Proceedings of the twentieth ACM symposium on Operating systems principles
Idletime scheduling with preemption intervals
Proceedings of the twentieth ACM symposium on Operating systems principles
Long-Range Dependence at the Disk Drive Level
QEST '06 Proceedings of the 3rd international conference on the Quantitative Evaluation of Systems
A fresh look at the reliability of long-term digital storage
Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006
Enhanced Reliability Modeling of RAID Storage Systems
DSN '07 Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks
An analysis of latent sector errors in disk drives
Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Efficient management of idleness in systems
Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Disk drive level workload characterization
ATEC '06 Proceedings of the annual conference on USENIX '06 Annual Technical Conference
Journaling versus soft updates: asynchronous meta-data protection in file systems
ATEC '00 Proceedings of the annual conference on USENIX Annual Technical Conference
Failure trends in a large disk drive population
FAST '07 Proceedings of the 5th USENIX conference on File and Storage Technologies
Understanding disk failure rates: What does an MTTF of 1,000,000 hours mean to you?
ACM Transactions on Storage (TOS)
Improving file system reliability with I/O shepherding
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
Parity lost and parity regained
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
An analysis of data corruption in the storage stack
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
Efficient management of idleness in storage systems
ACM Transactions on Storage (TOS)
Restrained utilization of idleness for transparent scheduling of background tasks
Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
On the impact of disk scrubbing on energy savings
HotPower'08 Proceedings of the 2008 conference on Power aware computing and systems
Disk drive workload captured in logs collected during the field return incoming test
WASL'08 Proceedings of the First USENIX conference on Analysis of system logs
Online availability upgrades for parity-based RAIDs through supplementary parity augmentations
ACM Transactions on Storage (TOS)
Disk Scrubbing Versus Intradisk Redundancy for RAID Storage Systems
ACM Transactions on Storage (TOS)
Hi-index | 0.00 |
Despite a low occurrence rate, silent data corruption represents a growing concern for storage systems designers. Throughout the storage hierarchy, from the file system down to the disk drives, various solutions exist to avoid, detect, and correct silent data corruption. Undetected errors during the completion of WRITEs may cause silent data corruption. A portion of the WRITE errors may be detected and corrected successfully by verifying the data written on the disk with the data in the disk cache. Write verification traditionally is scheduled immediately after a WRITE completion (Read After Write - RAW) which is unattractive, because it degrades user performance. To reduce the performance penalty associated with RAW, we propose to retain the written content in the disk cache and verify it once the disk drive becomes idle. Although attractive, this approach (called IRAW - Idle Read After Write) contends for resources, i.e., cache and idle time, with user traffic and other background activities. In this paper, we present a trace-driven evaluation of IRAW and show its feasibility. Our analysis indicates that idleness is present in disk drives and can be utilized for WRITE verification with minimal effect on user performance. IRAW benefits significantly if some amount of cache, i.e., 1 or 2 MB, is dedicated to retain the unverified WRITEs. If the cache is shared with the user requests then a cache retention policy that places both READs and WRITEs upon completion at the most recently used cache segment, yields best IRAW performance without effecting user READs cache hit ratio and overall user performance.