Current parallel storage systems use thousands of inexpensive disks to meet the storage requirements of applications. Data redundancy and/or coding are used to enhance data availability. For instance, row-diagonal parity (RDP) and EVENODD codes, which are widely used in RAID-6 storage systems, maintain data availability with up to two disk failures. To reduce the probability of data unavailability, recovery is carried out whenever a single disk fails. We find that the conventional recovery schemes of RDP and EVENODD codes for a single failed disk use only one parity disk. However, there are two parity disks in the system, and both can be used for single-disk failure recovery. In this article, we propose a hybrid recovery approach that uses both parities for single-disk failure recovery, and we design efficient recovery schemes for the RDP code (RDOR-RDP) and the EVENODD code (RDOR-EVENODD). Our recovery schemes have the following attractive properties: (1) “read optimality”, in the sense that they issue the smallest number of disk reads needed to recover a single failed disk, reducing disk reads by approximately 1/4 compared with conventional schemes; and (2) “load balancing”, in that all surviving disks are subjected to the same (or almost the same) amount of additional workload while the failed disk is rebuilt. We carry out a performance evaluation with DiskSim to quantify the merits of RDOR-RDP and RDOR-EVENODD on some widely used disks. The offline experimental results show that RDOR-RDP and RDOR-EVENODD outperform the conventional recovery schemes of RDP and EVENODD codes in terms of total recovery time and recovery workload on each individual surviving disk. However, the improvements are smaller than the theoretical value (approximately 25%), because RDOR-RDP and RDOR-EVENODD change the disk access pattern from purely sequential to a more random one.
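The claimed saving of roughly 1/4 of the disk reads can be checked on small arrays with a brute-force count. The sketch below is not the RDOR-RDP algorithm itself. It assumes the standard RDP layout, which the abstract does not spell out: a prime p, p - 1 rows per stripe, disks 0..p-2 holding data, disk p-1 holding row parity, disk p holding diagonal parity, block (row i, disk j) lying on diagonal (i + j) mod p, and diagonal p - 1 not being stored. The function names are hypothetical. For each lost block it tries both the row parity set and the diagonal parity set, and reports the smallest number of distinct blocks that must be read.

```python
from itertools import product

def row_set(p, failed, i):
    """Blocks read to rebuild row i of the failed disk via row parity."""
    return {(i, j) for j in range(p) if j != failed}

def diag_set(p, failed, i):
    """Blocks read to rebuild row i of the failed disk via diagonal parity.

    Returns None when the lost block lies on the unstored diagonal p - 1.
    """
    d = (i + failed) % p
    if d == p - 1:
        return None
    blocks = {((d - j) % p, j)
              for j in range(p)
              if j != failed and (d - j) % p != p - 1}
    blocks.add((d, p))  # the diagonal parity block itself
    return blocks

def conventional_reads(p, failed):
    """Distinct blocks read when every lost block uses row parity only."""
    reads = set()
    for i in range(p - 1):
        reads |= row_set(p, failed, i)
    return len(reads)

def min_hybrid_reads(p, failed):
    """Fewest distinct blocks read over all row/diagonal choices (exhaustive)."""
    best = None
    for choice in product((False, True), repeat=p - 1):
        reads = set()
        feasible = True
        for i, use_diag in enumerate(choice):
            s = diag_set(p, failed, i) if use_diag else row_set(p, failed, i)
            if s is None:
                feasible = False
                break
            reads |= s
        if feasible and (best is None or len(reads) < best):
            best = len(reads)
    return best

if __name__ == "__main__":
    for p in (5, 7):
        print(p, conventional_reads(p, 0), min_hybrid_reads(p, 0))
```

Under these assumptions, each row parity set and each diagonal parity set for a different row share exactly one block, so splitting the lost blocks evenly between the two parities maximizes reuse. The search is exponential in p and only illustrates the read count; it says nothing about load balancing or about the access-pattern effects the experiments measure.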