EVENODD: An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures
IEEE Transactions on Computers - Special issue on fault-tolerant computing
A tutorial on Reed-Solomon coding for fault-tolerance in RAID-like systems
Software—Practice & Experience
Awarded Best Paper! -- Row-Diagonal Parity for Double Disk Failure Correction
FAST '04 Proceedings of the 3rd USENIX Conference on File and Storage Technologies
IEEE Transactions on Information Theory
X-code: MDS array codes with optimal encoding
IEEE Transactions on Information Theory
WEAVER codes: highly fault tolerant erasure codes for storage systems
FAST'05 Proceedings of the 4th conference on USENIX Conference on File and Storage Technologies - Volume 4
An analysis of latent sector errors in disk drives
Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
An analysis of data corruption in the storage stack
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
SWEEPER: an efficient disaster recovery point identification mechanism
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
An analysis of data corruption in the storage stack
ACM Transactions on Storage (TOS)
A performance evaluation and examination of open-source erasure coding libraries for storage
FAST '09 Proccedings of the 7th conference on File and storage technologies
International Journal of High Performance Computing Applications
Higher reliability redundant disk arrays: Organization, operation, and coding
ACM Transactions on Storage (TOS)
Optimal recovery of single disk failure in RDP code storage systems
Proceedings of the ACM SIGMETRICS international conference on Measurement and modeling of computer systems
A spin-up saved is energy earned: achieving power-efficient, erasure-coded storage
HotDep'08 Proceedings of the Fourth conference on Hot topics in system dependability
ACM Transactions on Storage (TOS)
In search of I/O-optimal recovery from disk failures
HotStorage'11 Proceedings of the 3rd USENIX conference on Hot topics in storage and file systems
A Hybrid Approach to Failed Disk Recovery Using RAID-6 Codes: Algorithms and Performance Evaluation
ACM Transactions on Storage (TOS)
Rethinking erasure codes for cloud file systems: minimizing I/O for recovery and degraded reads
FAST'12 Proceedings of the 10th USENIX conference on File and Storage Technologies
ACM SIGOPS Operating Systems Review
ACM Transactions on Storage (TOS)
Hi-index | 0.00 |
Erasures codes, particularly those protecting against multiple failures in RAID disk arrays, provide a code-specific means for reconstruction of lost (erased) data. In the RAID application this is modeled as loss of strips so that reconstruction algorithms are usually optimized to reconstruct entire strips; that is, they apply only to highly correlated sector failures, i.e., sequential sectors on a lost disk. In this paper we address two more general problems: (1) recovery of lost data due to scattered or uncorrelated erasures and (2) recovery of partial (but sequential) data from a single lost disk (in the presence of any number of failures). The latter case may arise in the context of host IO to a partial strip on a lost disk. The methodology we propose for both problems is completely general and can be applied to any erasure code, but is most suitable for XOR-based codes. For the scattered erasures, typically due to hard errors on the disk (or combinations of hard errors and disk loss), our methodology provides for one of two outcomes for the data on each lost sector. Either the lost data is declared unrecoverable (in the information-theoretic sense) or it is declared recoverable and a formula is provided for the reconstruction that depends only on readable sectors. In short, the methodology is both complete and constructive.