AFRAID: a frequently redundant array of independent disks

Authors:
Stefan Savage;John Wilkes
Affiliations:
University of Washington, Seattle, WA;Hewlett-Packard Laboratories, Palo Alto, CA
Venue:
ATEC '96 Proceedings of the 1996 annual conference on USENIX Annual Technical Conference
Year:
1996


Abstract

Disk arrays are commonly designed to ensure that stored data will always be able to withstand a disk failure, but meeting this goal comes at a significant cost in performance. We show that this is unnecessary. By trading away a fraction of the enormous reliability provided by disk arrays, it is possible to achieve performance that is almost as good as a non-parity-protected set of disks. In particular, our AFRAID design eliminates the small-update penalty that plagues traditional RAID 5 disk arrays. It does this by applying the data update immediately, but delaying the parity update to the next quiet period between bursts of client activity. That is, AFRAID makes sure that the array is frequently redundant, even if it isn't always so. By regulating the parity update policy, AFRAID allows a smooth trade-off between performance and availability. Under real-life workloads, the AFRAID design can provide close to the full performance of an array of unprotected disks, and data availability comparable to a traditional RAID 5. Our results show that AFRAID offers 42% better performance for only 10% less availability, 97% better for 23% less, and as much as a factor of 4.1 times better performance for giving up less than half RAID 5's availability. We explore here the detailed availability and performance implications of the AFRAID approach.
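The deferred-parity idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `DeferredParityArray`, the `on_idle` hook, the in-memory block lists, and the I/O counting are all assumptions made for illustration, and parity is assumed to be rebuilt by re-reading the whole stripe. The sketch only shows the core trade: a small write costs one disk I/O, the stripe stays unprotected until the next idle period, and a block can be reconstructed only once parity has caught up.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class DeferredParityArray:
    """Toy stripe store (hypothetical): data lands at once, parity is repaired later."""

    def __init__(self, num_stripes, data_disks, block_size):
        zero = bytes(block_size)
        self.data = [[zero] * data_disks for _ in range(num_stripes)]
        self.parity = [zero] * num_stripes  # all-zero data has all-zero parity
        self.unprotected = set()  # stripes whose parity is stale
        self.disk_ios = 0  # simple count of simulated block reads and writes

    def write(self, stripe, disk, block):
        # Apply the data update immediately: one I/O, no parity read-modify-write.
        self.data[stripe][disk] = block
        self.disk_ios += 1
        self.unprotected.add(stripe)

    def on_idle(self):
        # Assumed idle-period hook: recompute parity for every stale stripe
        # by reading its data blocks and writing one parity block.
        for stripe in sorted(self.unprotected):
            self.parity[stripe] = xor_blocks(self.data[stripe])
            self.disk_ios += len(self.data[stripe]) + 1
        self.unprotected.clear()

    def reconstruct(self, stripe, lost_disk):
        # A stripe with stale parity cannot survive a disk failure.
        if stripe in self.unprotected:
            raise RuntimeError("stripe parity is stale; block is unrecoverable")
        survivors = [b for i, b in enumerate(self.data[stripe]) if i != lost_disk]
        return xor_blocks(survivors + [self.parity[stripe]])


array = DeferredParityArray(num_stripes=2, data_disks=3, block_size=4)
array.write(stripe=0, disk=1, block=b"\x01\x02\x03\x04")
stale_before_idle = set(array.unprotected)
array.on_idle()
recovered = array.reconstruct(stripe=0, lost_disk=1)
```

In this toy model the window of vulnerability is exactly the time a stripe spends in `unprotected`. Calling `on_idle` more eagerly shrinks that window at the price of more background I/O, which is one way to picture the performance and availability trade-off the abstract describes.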