Hierarchical RAID (HRAID) extends the RAID paradigm to mask the failure of whole Storage Nodes (SNs), or bricks, where each SN is a disk array with a certain RAID level. HRAID$k/\ell$ with $N$ SNs and $M$ disks per SN tolerates $k$ SN failures and $\ell$ disk failures per SN using Maximum Distance Separable (MDS) erasure codes, which introduce the minimum level of redundancy at each level. For $N=M$ there are $k$ internode and $\ell$ intranode check strips per SN, occupying the capacity of as many disks, for a storage redundancy of $(k+\ell)/N$; a higher storage redundancy is required for $M \neq N$. HRAID$k/\ell$ tolerates all combinations of up to $d_{\min}=(k+1)(\ell+1)-1$ disk failures, and some combinations of up to $d_{\max}=N\ell+Mk-k\ell$ disk failures. Three options for HRAID operation are: (I) intranode recovery only; (II) intranode and internode recovery, with on-demand reconstruction of blocks and rebuild; (III) multistep internode recovery with no rebuild processing. The I/Os Per Second (IOPS) metric is used to assess the cost of fault tolerance for HRAID$k/\ell$ against RAID$(4+\ell)$ and RAID0, for varying $k$ and $\ell$. The maximum IOPS is at its lowest in degraded mode, but after restriping the normal-mode IOPS may be exceeded even with fewer operational disks. Asymptotic reliability analysis and simulation results show that HRAID$k/\ell$ with $\ell > k$ provides a higher reliability when SN failures are due to disk rather than controller failures. Monte Carlo simulation is used to quantify the effect on the Mean Time to Data Loss (MTTDL) of the recovery options, for varying $k$ and $\ell$ and as the SN controller failure rate is varied relative to the disk failure rate. The HRAID paradigm is justified by the fact that Option II attains a significantly higher MTTDL than Option I. Option III, with no rebuild processing, attains an MTTDL exceeding that of Option II, but with poorer performance.
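
The fault-tolerance bounds and the storage redundancy quoted above are simple enough to evaluate directly. The following is a minimal sketch in Python, not code from the paper: the function names are hypothetical, `ell` stands for the per-SN disk failure tolerance, and `tolerates` assumes a simplified node-level model in which an SN counts as lost once more than `ell` of its disks have failed and data is lost once more than `k` SNs are lost. The actual strip-level recovery in HRAID may behave differently.

```python
from fractions import Fraction


def d_min(k, ell):
    """Largest failure count that is always tolerated: (k+1)(ell+1) - 1."""
    return (k + 1) * (ell + 1) - 1


def d_max(n, m, k, ell):
    """Largest failure count tolerated in the best case: N*ell + M*k - k*ell."""
    return n * ell + m * k - k * ell


def redundancy_square(n, k, ell):
    """Storage redundancy (k + ell) / N, valid only for the N = M case."""
    return Fraction(k + ell, n)


def tolerates(failed_per_sn, k, ell):
    """Simplified node-level check (an assumption, not the paper's model).

    failed_per_sn[i] is the number of failed disks in SN i. An SN is
    treated as lost when it has more than ell failed disks, and the
    array survives while at most k SNs are lost.
    """
    lost_sns = sum(1 for failed in failed_per_sn if failed > ell)
    return lost_sns <= k


if __name__ == "__main__":
    n = m = 12  # illustrative sizes, chosen here and not taken from the source
    k, ell = 1, 1
    print("d_min =", d_min(k, ell))
    print("d_max =", d_max(n, m, k, ell))
    print("redundancy =", redundancy_square(n, k, ell))

    # Best case: one SN fully failed, one failed disk in every other SN.
    best_case = [m] + [ell] * (n - 1)
    print(sum(best_case), "failures tolerated:", tolerates(best_case, k, ell))

    # Worst case: (k+1) SNs with (ell+1) failed disks each.
    worst_case = [ell + 1] * (k + 1) + [0] * (n - k - 1)
    print(sum(worst_case), "failures tolerated:", tolerates(worst_case, k, ell))
```

Under this simplified model, the best-case pattern contains exactly `d_max` failed disks and the smallest fatal pattern contains `d_min + 1`, which is one way to see where the two expressions come from.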