Discovery through gossip

  • Authors:
  • Bernhard Haeupler; Gopal Pandurangan; David Peleg; Rajmohan Rajaraman; Zhifeng Sun

  • Affiliations:
  • MIT, Boston, MA, USA; NTU, Singapore, Singapore; Weizmann Institute, Rehovot, Israel; Northeastern University, Boston, MA, USA; Northeastern University, Boston, MA, USA

  • Venue:
  • Proceedings of the Twenty-Fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures
  • Year:
  • 2012

Abstract

We study randomized gossip-based processes in dynamic networks that are motivated by information discovery in large-scale distributed networks such as peer-to-peer and social networks. A well-studied problem in peer-to-peer networks is resource discovery, where the goal for nodes (hosts with IP addresses) is to discover the IP addresses of all other hosts. Also, some recent work on self-stabilization algorithms for P2P/overlay networks proceeds via discovery of the complete network. In social networks, nodes (people) discover new nodes by exchanging contacts with their neighbors (friends). In both cases the discovery of new nodes changes the underlying network (new edges are added to it), and the process continues in the changed network. Rigorously analyzing such dynamic (stochastic) processes in a continuously changing topology remains a challenging problem with obvious applications. This paper studies and analyzes two natural gossip-based discovery processes. In the push discovery or triangulation process, each node repeatedly chooses two random neighbors and connects them (i.e., "pushes" their mutual information to each other). In the pull discovery process, or two-hop walk, each node repeatedly requests or "pulls" a random contact from a random neighbor and connects itself to this two-hop neighbor. Both processes are lightweight (the amortized work done per node is constant per round), local, and naturally robust owing to the inherently randomized nature of gossip. Our main result is an almost-tight analysis of the time taken for these two randomized processes to converge. We show that in any undirected n-node graph both processes take O(n log² n) rounds to connect every node to all other nodes with high probability, whereas Ω(n log n) is a lower bound.
We also study the two-hop walk in directed graphs, and show that it takes O(n² log n) time with high probability, and that this worst-case bound is tight for arbitrary directed graphs, whereas Ω(n²) is a lower bound for strongly connected directed graphs. A key technical challenge that we overcome in our work is the analysis of a randomized process that itself results in a constantly changing network, leading to complicated dependencies in every round. We discuss implications of our results and their analysis for discovery problems in P2P networks as well as for evolution in social networks.
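
The push (triangulation) and pull (two-hop walk) processes on an undirected graph can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation. It assumes synchronous rounds in which all nodes choose first and the new edges are added afterwards, that the push step picks two distinct neighbors, and that the starting graph is a path. The function names (`path_graph`, `push_round`, `pull_round`, `rounds_to_complete`) and the `max_rounds` cap are hypothetical and chosen only for illustration.

```python
import random


def path_graph(n):
    """Undirected path 0 - 1 - ... - (n-1) as a dict of neighbor sets (assumed start graph)."""
    return {u: {v for v in (u - 1, u + 1) if 0 <= v < n} for u in range(n)}


def add_edges(adj, edges):
    """Insert undirected edges; re-adding an existing edge is a no-op."""
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)


def push_round(adj, rng):
    """Push discovery: each node picks two distinct random neighbors and connects them."""
    edges = []
    for u in adj:
        nbrs = sorted(adj[u])
        if len(nbrs) >= 2:  # a node with a single neighbor has nothing to push
            edges.append(tuple(rng.sample(nbrs, 2)))
    add_edges(adj, edges)  # assumption: edges are applied at the end of the round


def pull_round(adj, rng):
    """Pull discovery: each node connects to a random neighbor of a random neighbor."""
    edges = []
    for u in adj:
        if not adj[u]:
            continue  # isolated node; cannot occur in a connected graph
        v = rng.choice(sorted(adj[u]))
        w = rng.choice(sorted(adj[v]))
        if w != u:  # the two-hop walk may return to u, which adds nothing
            edges.append((u, w))
    add_edges(adj, edges)


def rounds_to_complete(n, round_fn, seed=0, max_rounds=100_000):
    """Count rounds until every node is adjacent to all others (None if the cap is hit)."""
    rng = random.Random(seed)
    adj = path_graph(n)
    rounds = 0
    while any(len(adj[u]) != n - 1 for u in adj):
        if rounds == max_rounds:
            return None
        round_fn(adj, rng)
        rounds += 1
    return rounds


if __name__ == "__main__":
    for name, fn in (("push", push_round), ("pull", pull_round)):
        print(name, rounds_to_complete(32, fn, seed=1))
```

Running the script for increasing n and comparing the round counts against n log n and n log² n gives an informal feel for the bounds stated in the abstract; a single seeded run is only one sample and says nothing about the high-probability guarantee.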