Modern storage systems use thousands of inexpensive disks to meet the storage requirements of applications. To enhance data availability, some form of redundancy is used. For example, conventional RAID-5 systems tolerate only a single disk failure, while more recent coding techniques such as row-diagonal parity (RDP) tolerate up to two disk failures. To reduce the probability of data unavailability, disk recovery (or rebuild) is carried out whenever a single disk fails. We show that the conventional recovery scheme of the RDP code for a single disk failure is inefficient and suboptimal. In this paper, we propose an optimal and efficient disk recovery scheme, Row-Diagonal Optimal Recovery (RDOR), for a single disk failure in the RDP code. RDOR has the following properties: (1) it is read optimal, in the sense that it issues the smallest number of disk reads needed to recover the failed disk; (2) it is load balanced, in that all surviving disks are subjected to the same amount of additional workload while rebuilding the failed disk. We carefully explore the design state space and theoretically show the optimality of RDOR. We carry out a performance evaluation to quantify the merits of RDOR on some widely used disks.
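To make the read-optimality claim concrete, the following is a minimal sketch that counts the blocks read when a single disk in one RDP stripe is rebuilt. It is not the RDOR algorithm itself: it is a brute-force search over the per-block choice between row-parity and diagonal-parity recovery, and it does not address load balancing. The layout is an assumption based on the standard RDP construction (a prime `p`, disks `0..p-2` holding data, disk `p-1` holding row parity, disk `p` holding diagonal parity, `p-1` rows per stripe, diagonal `p-1` not stored). The function names `recovery_sets`, `conventional_reads`, and `min_reads` are hypothetical.

```python
from itertools import product


def recovery_sets(p, failed):
    """For each lost block (row i on disk `failed`), return a pair:
    the blocks read by row-parity recovery, and the blocks read by
    diagonal-parity recovery (None if the block lies on the unstored
    diagonal p-1). Blocks are (row, disk) tuples.
    Assumed layout: block (i, j) with j <= p-1 is on diagonal (i + j) mod p."""
    options = []
    for i in range(p - 1):
        # Row-parity set: every other block of row i on disks 0..p-1.
        row_set = frozenset((i, j) for j in range(p) if j != failed)
        d = (i + failed) % p
        if d == p - 1:
            diag_set = None  # this diagonal has no stored parity block
        else:
            # Surviving blocks of diagonal d plus its parity block on disk p.
            diag_set = frozenset(
                ((d - j) % p, j)
                for j in range(p)
                if j != failed and (d - j) % p != p - 1
            ) | {(d, p)}
        options.append((row_set, diag_set))
    return options


def conventional_reads(p, failed):
    """Distinct blocks read when every lost block uses row parity."""
    return len(set().union(*(row for row, _ in recovery_sets(p, failed))))


def min_reads(p, failed):
    """Smallest number of distinct blocks read over all row/diagonal choices
    (exhaustive search, practical only for small p)."""
    options = recovery_sets(p, failed)
    best = None
    for choice in product((0, 1), repeat=len(options)):
        if any(options[i][c] is None for i, c in enumerate(choice)):
            continue  # chosen diagonal is not stored, skip this combination
        blocks = set().union(*(options[i][c] for i, c in enumerate(choice)))
        if best is None or len(blocks) < best:
            best = len(blocks)
    return best


if __name__ == "__main__":
    print(conventional_reads(5, 0), min_reads(5, 0))  # prints "16 12"
```

Under these assumptions, with `p = 5` and data disk 0 failed, row-parity-only recovery reads 16 blocks while the best mixed choice reads 12, because a block already fetched for a row-parity set can be reused by a diagonal-parity set.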