As storage systems grow in size and complexity, they increasingly face concurrent disk failures combined with multiple unrecoverable sector errors. Ensuring high data reliability and availability therefore requires erasure codes with high fault tolerance. In this article, we present a new family of such codes, named GRID codes. The name reflects their structure: they are strip-based codes whose strips are arranged into multi-dimensional grids. To construct GRID codes, we first introduce the concept of matched codes and then show how matched codes are combined into GRID codes. We also propose an iterative reconstruction algorithm for GRID codes and discuss some of their important features. Finally, we compare GRID codes with several categories of existing codes. Our comparisons show that, for large-scale storage systems, GRID codes have attractive advantages over many existing erasure codes: (a) they are completely XOR-based and have very regular structures, ensuring easy implementation; (b) they can provide fault tolerance of 15 or higher; and (c) their storage efficiency can reach 80% or higher. These advantages make GRID codes well suited to large-scale storage systems.
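To make the grid arrangement and the iterative reconstruction more concrete, the following is a minimal sketch only, not the construction from the article. It assumes the simplest possible setting: a two-dimensional grid in which every row and every column is protected by a single XOR parity strip, whereas the article builds GRID codes from matched codes that are not specified here. The function names `encode` and `reconstruct` are hypothetical, and each strip is represented by a single integer standing in for a block of bytes.

```python
from functools import reduce
from operator import xor


def encode(data):
    """Append an XOR parity strip to each row, then append a parity row.

    Assumption: single parity per dimension. After encoding, every row
    and every column of the grid XORs to zero.
    """
    rows = [row + [reduce(xor, row)] for row in data]
    parity_row = [reduce(xor, col) for col in zip(*rows)]
    return rows + [parity_row]


def reconstruct(grid):
    """Rebuild erased strips (marked None) in place, one line at a time.

    A line is a row or a column. Any line with exactly one erased strip
    can be repaired, which may in turn make other lines repairable.
    Returns True if every strip was recovered, False if the process stalls.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    lines = [[(r, c) for c in range(n_cols)] for r in range(n_rows)]
    lines += [[(r, c) for r in range(n_rows)] for c in range(n_cols)]

    progress = True
    while progress:
        progress = False
        for line in lines:
            missing = [(r, c) for r, c in line if grid[r][c] is None]
            if len(missing) == 1:
                r, c = missing[0]
                # The erased strip equals the XOR of the other strips in the line.
                grid[r][c] = reduce(
                    xor, (grid[i][j] for i, j in line if (i, j) != (r, c))
                )
                progress = True
    return all(v is not None for row in grid for v in row)


if __name__ == "__main__":
    full = encode([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    damaged = [row[:] for row in full]
    for r, c in [(0, 0), (0, 1), (1, 0)]:
        damaged[r][c] = None  # simulate three lost strips
    print(reconstruct(damaged), damaged == full)
```

In this toy setting, row 0 cannot be repaired at first because it has two erasures, but repairing row 1 and then the two affected columns resolves it, which is the iterative behavior the sketch is meant to show. Four erasures placed at the corners of a rectangle leave every affected row and column with two missing strips, so the loop stalls and `reconstruct` returns False. Higher fault tolerance would need stronger codes in each dimension than the single parity assumed here.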