This paper addresses the problem of selecting and discovering a consistent availability monitoring overlay for computer hosts in a large-scale distributed application, where hosts may be selfish or colluding. We motivate six goals for the problem: consistency, verifiability, and randomness in selecting the availability monitors of nodes, and discoverability, load-balancing, and scalability in finding these monitors. We then present a new system, called AVMON, that is the first to satisfy all six requirements. The core algorithmic contribution of this paper is a protocol for discovering the availability monitoring overlay in a scalable and efficient manner, given any monitor selection scheme that is consistent and verifiable. We mathematically analyze the performance of AVMON's discovery protocols, and derive an optimal variant that minimizes the memory, bandwidth, computation, and discovery time of monitors. Our experimental evaluations of AVMON use three types of availability traces (synthetic, from PlanetLab, and from the Overnet peer-to-peer system) and demonstrate that AVMON works well in a variety of distributed systems.
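To make the notion of a "consistent and verifiable" monitor selection scheme concrete, the following minimal sketch shows one way such a scheme could look: a node monitors a target when a hash of the ordered pair falls below a threshold. Everything here is an illustrative assumption rather than something stated in the abstract: the use of SHA-256, the threshold `k / n`, and the names `pair_hash`, `is_monitor`, and `monitors_of` are hypothetical. The brute-force scan in `monitors_of` only stands in for the discovery step; it is not AVMON's discovery protocol, whose point is to avoid scanning all nodes.

```python
import hashlib


def pair_hash(monitor_id: str, target_id: str) -> float:
    """Map an ordered (monitor, target) pair to a float in [0, 1).

    Assumption: SHA-256 stands in for whatever hash a real scheme would use.
    """
    digest = hashlib.sha256(f"{monitor_id}|{target_id}".encode("utf-8")).digest()
    # Keep the top 53 bits so the quotient is exactly representable and < 1.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def is_monitor(monitor_id: str, target_id: str, k: int, n: int) -> bool:
    """Return True if monitor_id should monitor target_id.

    Consistent: the answer depends only on the two identifiers, k, and n.
    Verifiable: any third party can recompute the hash and check the claim.
    Random: a target cannot choose its own monitors without choosing its id.
    """
    if monitor_id == target_id:
        return False  # a node never monitors itself
    return pair_hash(monitor_id, target_id) < k / n


def monitors_of(target_id: str, nodes: list[str], k: int) -> list[str]:
    """Brute-force scan over all known nodes (illustration only)."""
    n = len(nodes)
    return [m for m in nodes if is_monitor(m, target_id, k, n)]


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(1000)]
    # With k = 8 and n = 1000, each target gets about 8 monitors on average.
    print(monitors_of("node0", nodes, k=8))
```

Because the selection is a pure function of the identifiers, a colluding pair cannot simply declare itself a monitor relationship: any other host can rerun `is_monitor` and reject the claim.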