Separating durability and availability in self-managed storage

Authors:
Geoffrey Lefebvre;Michael J. Feeley
Affiliations:
University of British Columbia;University of British Columbia
Venue:
Proceedings of the 11th ACM SIGOPS European Workshop
Year:
2004

Abstract

Building reliable data storage from unreliable components presents many challenges and is of particular interest for peer-to-peer storage systems. Recent work has examined the trade-offs associated with ensuring data availability in such systems. Reliability, however, is more than just availability. In fact, the durability of data is typically the greater concern. While users are likely to tolerate occasional disconnection from their data (they will likely have no choice in the matter), they demand a much stronger guarantee that their data is never permanently lost due to failure. Delivering strong durability guarantees efficiently, however, requires decoupling durability from availability. This paper describes the design of a data redundancy scheme that guarantees durability independently of availability. We provide a formula for determining the rate of redundancy repair when durability is the only concern and show that availability requires much more frequent repair. We simulate modified versions of the Total Recall block store that incorporate our design. Our results show that we can deliver durability more cheaply than availability, reducing network overhead by 50% to 97%.