Parity declustering for continuous operation in redundant disk arrays
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
RAID-II: a high-bandwidth network file server
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
EVENODD: An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures
IEEE Transactions on Computers - Special issue on fault-tolerant computing
Practical loss-resilient codes
STOC '97 Proceedings of the twenty-ninth annual ACM symposium on Theory of computing
Effective erasure codes for reliable computer communication protocols
ACM SIGCOMM Computer Communication Review
A tutorial on Reed-Solomon coding for fault-tolerance in RAID-like systems
Software—Practice & Experience
Lessons from Giant-Scale Services
IEEE Internet Computing
Erasure Coding Vs. Replication: A Quantitative Comparison
IPTPS '01 Revised Papers from the First International Workshop on Peer-to-Peer Systems
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Note: Correction to the 1997 tutorial on Reed–Solomon coding
Software—Practice & Experience - Research Articles
Matrix methods for lost data reconstruction in erasure codes
FAST'05 Proceedings of the 4th conference on USENIX Conference on File and Storage Technologies - Volume 4
WEAVER codes: highly fault tolerant erasure codes for storage systems
FAST'05 Proceedings of the 4th conference on USENIX Conference on File and Storage Technologies - Volume 4
Disk failures in the real world: what does an MTTF of 1,000,000 hours mean to you?
FAST '07 Proceedings of the 5th USENIX conference on File and Storage Technologies
REO: a generic RAID engine and optimizer
FAST '07 Proceedings of the 5th USENIX conference on File and Storage Technologies
FAST '07 Proceedings of the 5th USENIX conference on File and Storage Technologies
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
STAR: An Efficient Coding Scheme for Correcting Triple Storage Node Failures
IEEE Transactions on Computers
A performance evaluation and examination of open-source erasure coding libraries for storage
FAST '09 Proceedings of the 7th conference on File and storage technologies
International Journal of High Performance Computing Applications
DiskReduce: RAID for data-intensive scalable computing
Proceedings of the 4th Annual Workshop on Petascale Data Storage
Explicit construction of optimal exact regenerating codes for distributed storage
Allerton'09 Proceedings of the 47th annual Allerton conference on Communication, control, and computing
Data warehousing and analytics infrastructure at Facebook
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Optimal recovery of single disk failure in RDP code storage systems
Proceedings of the ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Network coding for distributed storage systems
IEEE Transactions on Information Theory
Flat XOR-based erasure codes in storage systems: Constructions, efficient recovery, and tradeoffs
MSST '10 Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)
Availability in globally distributed storage systems
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Row-diagonal parity for double disk failure correction
FAST'04 Proceedings of the 3rd USENIX conference on File and storage technologies
Improving storage system availability with D-GRAID
FAST'04 Proceedings of the 3rd USENIX conference on File and storage technologies
In search of I/O-optimal recovery from disk failures
HotStorage'11 Proceedings of the 3rd USENIX conference on Hot topics in storage and file systems
Windows Azure Storage: a highly available cloud storage service with strong consistency
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
High availability in DHTs: erasure coding vs. replication
IPTPS'05 Proceedings of the 4th international conference on Peer-to-Peer Systems
MDS array codes with independent parity symbols
IEEE Transactions on Information Theory
IEEE Transactions on Information Theory
X-code: MDS array codes with optimal encoding
IEEE Transactions on Information Theory
Erasure coding in Windows Azure Storage
USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
ACM Transactions on Storage (TOS)
XORing elephants: novel erasure codes for big data
Proceedings of the VLDB Endowment
Regenerating codes: a system perspective
ACM SIGOPS Operating Systems Review
High performance & low latency in solid-state drives through redundancy
Proceedings of the 1st Workshop on Interactions of NVM/FLASH with Operating Systems and Workloads
HotStorage'13 Proceedings of the 5th USENIX conference on Hot Topics in Storage and File Systems
Sector-Disk (SD) Erasure Codes for Mixed Failure Modes in RAID Systems
ACM Transactions on Storage (TOS)
SD codes: erasure codes designed for how storage systems really fail
FAST'13 Proceedings of the 11th USENIX conference on File and Storage Technologies
Screaming fast Galois field arithmetic using Intel SIMD instructions
FAST'13 Proceedings of the 11th USENIX conference on File and Storage Technologies
FAST'14 Proceedings of the 12th USENIX conference on File and Storage Technologies
To reduce storage overhead, cloud file systems are transitioning from replication to erasure codes. This process has revealed new dimensions on which to evaluate the performance of different coding schemes: the amount of data used in recovery and when performing degraded reads. We present an algorithm that finds the optimal number of codeword symbols needed for recovery for any XOR-based erasure code and produces recovery schedules that use a minimum amount of data. We differentiate popular erasure codes based on this criterion and demonstrate that the differences improve I/O performance in practice for the large block sizes used in cloud file systems. Several cloud systems [15, 10] have adopted Reed-Solomon (RS) codes, because of their generality and their ability to tolerate larger numbers of failures. We define a new class of rotated Reed-Solomon codes that perform degraded reads more efficiently than all known codes, but otherwise inherit the reliability and performance properties of Reed-Solomon codes.
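The idea of a recovery schedule that reads a minimum amount of data can be illustrated with a small brute-force sketch. This is not the algorithm from the paper: it only tries, for each lost symbol, every single parity equation that contains that symbol and no other lost symbol, and keeps the combination whose union of surviving symbols is smallest. It ignores sums of equations and the reuse of already-recovered symbols. The layout in `EQUATIONS`, the symbol names, and the function `min_recovery_reads` are hypothetical, chosen only for illustration; the toy code is not RDP, EVENODD, or any other named construction.

```python
from itertools import product

# Hypothetical toy XOR code: 3 data disks with 2 rows each (d<row><disk>),
# a row-parity disk (p0, p1) and a second parity disk (q0, q1).
# Each equation is a set of symbols whose XOR is zero.
EQUATIONS = {
    "P0": {"d00", "d01", "d02", "p0"},
    "P1": {"d10", "d11", "d12", "p1"},
    "Q0": {"d00", "d11", "d12", "q0"},
    "Q1": {"d10", "d01", "d02", "q1"},
}


def min_recovery_reads(equations, lost):
    """Return (chosen equations, symbols to read) with the fewest reads.

    Brute force: one equation per lost symbol, all combinations tried.
    """
    lost = set(lost)
    candidates = []
    for symbol in sorted(lost):
        # An equation is usable for `symbol` if it contains that symbol
        # and no other lost symbol.
        usable = [
            name
            for name, members in sorted(equations.items())
            if members & lost == {symbol}
        ]
        if not usable:
            raise ValueError(f"no single equation recovers {symbol}")
        candidates.append(usable)

    best = None
    for choice in product(*candidates):
        reads = set()
        for name in choice:
            reads |= equations[name] - lost
        # Strict '<' keeps the first minimal choice found.
        if best is None or len(reads) < len(best[1]):
            best = (choice, reads)
    return best


if __name__ == "__main__":
    # Disk 0 fails, losing d00 and d10.
    choice, reads = min_recovery_reads(EQUATIONS, ["d00", "d10"])
    print(choice, sorted(reads))
```

In this toy layout, rebuilding disk 0 from row parity alone reads six distinct symbols, while mixing one row-parity equation with one equation from the second parity disk reads four, because the two chosen equations share surviving symbols. The exhaustive search is exponential in the number of lost symbols, so it is only practical for very small examples.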