Disk arrays are commonly designed to ensure that stored data will always be able to withstand a disk failure, but meeting this goal comes at a significant cost in performance. We show that this is unnecessary. By trading away a fraction of the enormous reliability provided by disk arrays, it is possible to achieve performance that is almost as good as a non-parity-protected set of disks. In particular, our AFRAID design eliminates the small-update penalty that plagues traditional RAID 5 disk arrays. It does this by applying the data update immediately, but delaying the parity update to the next quiet period between bursts of client activity. That is, AFRAID makes sure that the array is frequently redundant, even if it isn't always so. By regulating the parity update policy, AFRAID allows a smooth trade-off between performance and availability. Under real-life workloads, the AFRAID design can provide close to the full performance of an array of unprotected disks, and data availability comparable to a traditional RAID 5. Our results show that AFRAID offers 42% better performance for only 10% less availability, 97% better for 23% less, and as much as a factor of 4.1 times better performance for giving up less than half RAID 5's availability. We explore here the detailed availability and performance implications of the AFRAID approach.
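The deferred-parity idea described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the AFRAID implementation: the class name `DeferredParityStripeSet`, its methods, the in-memory block layout, and the `max_stripes` knob are all hypothetical names introduced here for illustration. Each stripe holds several data blocks plus one XOR parity block; a write updates the data block immediately and only marks the stripe as unprotected, and a separate idle-time routine recomputes parity for the marked stripes.

```python
from functools import reduce


class DeferredParityStripeSet:
    """Toy model (hypothetical): N data blocks per stripe plus one XOR parity block."""

    def __init__(self, stripes, data_disks, block_size=4):
        self.data = [[bytes(block_size) for _ in range(data_disks)]
                     for _ in range(stripes)]
        self.parity = [bytes(block_size) for _ in range(stripes)]
        self.unprotected = set()  # stripes whose parity is currently stale

    @staticmethod
    def _xor(blocks):
        # Byte-wise XOR across equally sized blocks.
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

    def write(self, stripe, disk, block):
        # Apply the data update at once; defer the parity update.
        self.data[stripe][disk] = block
        self.unprotected.add(stripe)

    def on_idle(self, max_stripes=None):
        # Meant to run in a quiet period: rebuild parity for stale stripes.
        # max_stripes is an assumed policy knob limiting work per idle call.
        todo = sorted(self.unprotected)[:max_stripes]
        for s in todo:
            self.parity[s] = self._xor(self.data[s])
            self.unprotected.discard(s)
        return len(todo)

    def reconstruct(self, stripe, failed_disk):
        # Recover a lost block from the survivors and parity, if parity is fresh.
        if stripe in self.unprotected:
            raise RuntimeError("parity is stale for this stripe; block not recoverable")
        survivors = [b for i, b in enumerate(self.data[stripe]) if i != failed_disk]
        return self._xor(survivors + [self.parity[stripe]])


array = DeferredParityStripeSet(stripes=2, data_disks=3)
array.write(0, 1, b"\x01\x02\x03\x04")   # fast path: no parity I/O
print(array.unprotected)                  # stripe 0 is exposed until the next idle period
array.on_idle()                           # quiet period: parity is brought up to date
print(array.reconstruct(0, 1))            # the block is recoverable again
```

In this sketch, the trade-off between performance and availability shows up as the size of the `unprotected` set over time: a policy that calls `on_idle` more eagerly, or that forces a parity update once the set grows past some threshold, behaves more like RAID 5, while a lazier policy behaves more like an unprotected array.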