Gossip protocols are known to be highly robust in scenarios with high churn, but if the data that is being gossiped becomes corrupted, a protocol's very robustness can make it hard to fix the problem. All participants need to be taken down, any disk-based data needs to be scrubbed, the cause of the corruption needs to be fixed, and only then can participants be restarted. If even a single participant is skipped in this process, say because it was temporarily unreachable, then it can contaminate the entire system all over again. We describe the design and implementation of a new middleware for gossip protocols that addresses this problem. Our middleware offers the ability to update code dynamically and provides a small resilient core that allows updating code that has failed catastrophically. Our initial PlanetLab-based deployment demonstrates that the middleware is efficient.
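The re-contamination risk described above can be illustrated with a toy simulation. This is a minimal sketch, not the middleware itself: it assumes a simple push-style gossip in which every participant holding corrupted state sends it to one uniformly chosen peer per round. The function name `spread_corruption` and its parameters are hypothetical and chosen only for this illustration.

```python
import random


def spread_corruption(num_nodes, initially_corrupted, rounds, seed=0):
    """Simulate push gossip of corrupted state (illustrative assumption).

    Each round, every corrupted node pushes its state to one peer chosen
    uniformly at random. Returns the number of corrupted nodes after each
    round, starting with the initial count.
    """
    rng = random.Random(seed)
    corrupted = set(initially_corrupted)
    history = [len(corrupted)]
    for _ in range(rounds):
        # Collect this round's targets separately so the set is not
        # modified while it is being iterated.
        newly_corrupted = set()
        for _node in corrupted:
            newly_corrupted.add(rng.randrange(num_nodes))
        corrupted |= newly_corrupted
        history.append(len(corrupted))
    return history


if __name__ == "__main__":
    # One participant was skipped during cleanup and still holds bad state.
    history = spread_corruption(num_nodes=100, initially_corrupted=[0], rounds=100)
    print("corrupted nodes per round:", history[:15])
```

Under these assumptions, the corrupted population roughly doubles in the early rounds, so a single skipped participant is enough to reach the whole system again in a number of rounds that grows only logarithmically with system size.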