Proactive replication in distributed storage systems using machine availability estimation

Authors:
Alessandro Duminuco;Ernst Biersack;Taoufik En-Najjary
Affiliations:
Institut Eurecom, Sophia Antipolis, France;Institut Eurecom, Sophia Antipolis, France;Institut Eurecom, Sophia Antipolis, France
Venue:
CoNEXT '07 Proceedings of the 2007 ACM CoNEXT conference
Year:
2007

Abstract

Distributed storage systems provide data availability by means of redundancy. To maintain a given level of availability when nodes fail, new redundant fragments must be introduced. Since node failures can be either transient or permanent, deciding when to generate new fragments is non-trivial. A further difficulty is that the failure behavior, that is, the rates of permanent and transient failures, may vary over time. To adapt to changes in the failure behavior, many systems adopt a reactive approach, in which new fragments are created as soon as a failure is detected. However, reactive approaches tend to produce spikes in bandwidth consumption. Proactive approaches instead create new fragments at a fixed rate that is either derived from knowledge of the failure behavior or set by the system administrator. However, existing proactive systems cannot adapt to changing failure behavior, which is common in the real world. We propose a new technique based on an ongoing estimation of the failure behavior, obtained using a model that consists of a network of queues. This scheme combines the adaptiveness of reactive systems with the smooth bandwidth usage of proactive systems, and it generalizes both previous approaches. The choice between reactive and proactive thus becomes a special case of a wider approach that can be tuned to the dynamics of the failure behavior.
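
The trade-off described above can be illustrated with a minimal sketch. It is not the paper's queueing-network model: it assumes a much simpler estimator, an exponentially weighted moving average of the observed fragment loss rate, and uses that estimate as the fragment creation rate. The names `AdaptiveRepairRate`, `alpha`, and `simulate` are hypothetical and introduced only for this illustration. With `alpha = 1` the creation rate follows every observed burst, as a reactive system would; with `alpha` close to 0 it stays near its initial value, as a fixed-rate proactive system would.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveRepairRate:
    """Illustrative estimator (an assumption, not the paper's model).

    Keeps an exponentially weighted moving average of the observed
    fragment loss rate and uses it as the fragment creation rate.
    """
    alpha: float        # hypothetical smoothing weight in (0, 1]
    rate: float = 0.0   # current estimate, fragments lost per time unit

    def observe(self, losses: int, interval: float) -> float:
        """Fold one observation window into the estimate and return it."""
        if interval <= 0:
            raise ValueError("interval must be positive")
        sample = losses / interval
        self.rate = self.alpha * sample + (1.0 - self.alpha) * self.rate
        return self.rate


def simulate(alpha, losses_per_window, interval=1.0, initial_rate=0.0):
    """Return the creation rate chosen after each observation window."""
    estimator = AdaptiveRepairRate(alpha=alpha, rate=initial_rate)
    return [estimator.observe(n, interval) for n in losses_per_window]


if __name__ == "__main__":
    burst = [0, 10, 0, 0]  # made-up trace: one burst of fragment losses
    print("alpha=1.0:", simulate(1.0, burst))
    print("alpha=0.5:", simulate(0.5, burst))
```

In this toy setting a smaller `alpha` spreads the repair work for a burst over several windows, which lowers the peak creation rate at the cost of reacting more slowly to a genuine change in the failure behavior.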