Minimizing churn in distributed systems

  • Authors: P. Brighten Godfrey, Scott Shenker, Ion Stoica
  • Affiliations: UC Berkeley (all three authors)
  • Venue: Proceedings of the 2006 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM '06)
  • Year: 2006

Abstract

A pervasive requirement of distributed systems is to deal with churn: change in the set of participating nodes due to joins, graceful leaves, and failures. A high churn rate can increase costs or decrease service quality. This paper studies how to reduce churn by selecting which subset of a set of available nodes to use.

First, we compare the performance of a range of node selection strategies in five real-world traces. Among our findings is that the simple strategy of picking a uniform-random replacement whenever a node fails performs surprisingly well. We explain its performance through analysis in a stochastic model.

Second, we show that a class of strategies, which we call "Preference List" strategies, arise commonly as a result of optimizing for a metric other than churn, and produce high churn relative to more randomized strategies under realistic node failure patterns. Using this insight, we demonstrate and explain differences in performance for designs that incorporate varying degrees of randomization. We give examples from a variety of protocols, including anycast, overlay multicast, and distributed hash tables. In many cases, simply adding some randomization can go a long way towards reducing churn.
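
To make the two strategy families concrete, below is a minimal toy simulation written for this summary, not taken from the paper. The on/off availability model, all parameters, and the names (simulate, fail, recover) are illustrative assumptions; the paper itself evaluates strategies on five real-world traces. One standard renewal-theory intuition for why uniform-random replacement does well: sampling among currently-up nodes is biased toward nodes with long sessions, and hence long expected remaining uptime (the inspection paradox).

    import random

    def simulate(strategy, n=200, k=20, steps=20_000, seed=1):
        """Toy churn comparison. Each node flaps independently between up
        and down; half the nodes are stable (rare failures) and half are
        flaky. Churn counts every change to the set of k nodes in use."""
        rng = random.Random(seed)
        # Per-node failure rates: even-numbered nodes stable, odd ones flaky.
        fail = [0.001 if i % 2 == 0 else 0.02 for i in range(n)]
        recover = 0.05
        up = set(range(n))                      # nodes currently alive
        in_use = set(rng.sample(sorted(up), k)) # the k nodes in service
        churn = 0
        for _ in range(steps):
            # Nodes fail and recover independently each step.
            for i in range(n):
                if i in up:
                    if rng.random() < fail[i]:
                        up.discard(i)
                elif rng.random() < recover:
                    up.add(i)
            if strategy == "preference":
                # Preference list: always use the k most-preferred up nodes
                # (here, lowest-numbered), switching whenever a preferred
                # node reappears; this churns beyond what failures force.
                target = set(sorted(up)[:k])
                churn += len(target - in_use)
                in_use = target
            else:
                # Uniform random: touch the working set only when a node
                # actually fails; replace it uniformly from the spare pool.
                for dead in list(in_use - up):
                    in_use.discard(dead)
                    pool = list(up - in_use)
                    if pool:
                        in_use.add(rng.choice(pool))
                        churn += 1
        return churn

    if __name__ == "__main__":
        for s in ("preference", "random"):
            print(f"{s:>10}: churn = {simulate(s)}")

In this sketch the preference-list strategy incurs churn not only when an in-use node fails but every time a more-preferred node flaps back up, mirroring the effect the abstract attributes to optimizing for a metric other than churn; the random strategy changes the working set only when a failure forces it to.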