RAID storage systems protect data from storage errors, such as data corruption, using one or more integrity techniques, such as checksums. The exact protection offered by a given technique, or by a combination of techniques, is sometimes unclear. We introduce and apply a formal method of analyzing the design of data protection strategies. Specifically, we use model checking to evaluate whether common protection techniques used in parity-based RAID systems are sufficient in light of the increasingly complex failure modes of modern disk drives. We evaluate the approaches taken by a number of real systems under single-error conditions, and find flaws in every scheme. In particular, we identify a parity pollution problem that spreads corrupt data (the result of a single error) across multiple disks, thus leading to data loss or corruption. We further identify which protection measures must be used to avoid such problems. Finally, we show how to combine real-world failure data with the results from the model checker to estimate the actual likelihood of data loss under different protection strategies.
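To make the parity pollution problem concrete, the following is a minimal sketch of one way it could arise. It is an illustration under stated assumptions, not the model or the systems evaluated in the paper: it assumes a single stripe with three data blocks and one XOR parity block, no checksums, and a write that recomputes parity from the data blocks currently on disk. The block contents and all names (`xor_blocks`, `disks`, `polluted_parity`) are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks (RAID-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical stripe: three data blocks and their XOR parity.
good = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(good)

# Single error: the block on disk 1 is silently corrupted.
disks = list(good)
disks[1] = b"BXBB"

# While parity is still clean, block 1 can be rebuilt from the other blocks.
recovered_before = xor_blocks([disks[0], disks[2], parity])

# Assumed trigger: a write to disk 0 recomputes parity by reading the other
# data blocks from disk, including the corrupt one, with no verification.
disks[0] = b"DDDD"
polluted_parity = xor_blocks(disks)

# Parity now encodes the corruption, so rebuilding block 1 returns bad data.
recovered_after = xor_blocks([disks[0], disks[2], polluted_parity])

print(recovered_before)  # b'BBBB'
print(recovered_after)   # b'BXBB'
```

In this sketch the stripe is internally consistent after the write, so a later parity check would not flag it, and the original contents of block 1 are no longer recoverable from the stripe.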