Failure Detection and Membership Management in Grid Environments
GRID '04 Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing
A Self-Organizing Super-Peer Overlay with a Chord Core for Desktop Grids
IWSOS '08 Proceedings of the 3rd International Workshop on Self-Organizing Systems
Hi-index | 0.00 |
Gossip protocols and services provide a means by which failures can be detected in large, distributed systems in an asynchronous manner without the limits associated with reliable multicasting for group communications. Extending the gossip protocol such that a system reaches consensus on detected faults can be performed via a flat structure, or it can be hierarchically distributed across cooperating layers of nodes. In this paper, the performance of gossip services employing flat and hierarchical schemes is analyzed on an experimental testbed in terms of consensus time, resource utilization and scalability. Performance associated with a hierarchically arranged gossip scheme is analyzed with varying group sizes and is shown to scale well. Resource utilization of the gossip-style failure detection and consensus service is measured in terms of network bandwidth utilization and CPU utilization. Analytical models are developed for resource utilization and performance projections are made for large system sizes.