Learning the graph of epidemic cascades

  • Authors:
  • Praneeth Netrapalli;Sujay Sanghavi

  • Affiliations:
  • The University of Texas at Austin, Austin, TX, USA;The University of Texas at Austin, Austin, TX, USA

  • Venue:
  • Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE joint international conference on Measurement and Modeling of Computer Systems
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

We consider the problem of finding the graph on which an epidemic spreads, given only the times when each node gets infected. While this is a problem of central importance in several contexts -- offline and online social networks, e-commerce, epidemiology -- there has been very little work, analytical or empirical, on finding the graph. Clearly, it is impossible to do so from just one epidemic; our interest is in learning the graph from a small number of independent epidemics. For the classic and popular "independent cascade" epidemics, we analytically establish sufficient conditions on the number of epidemics for both the global maximum-likelihood (ML) estimator, and a natural greedy algorithm to succeed with high probability. Both results are based on a key observation: the global graph learning problem decouples into n local problems -- one for each node. For a node of degree d, we show that its neighborhood can be reliably found once it has been infected O(d2 log n) times (for ML on general graphs) or O(d log n) times (for greedy on trees). We also provide a corresponding information-theoretic lower bound of Ω(d log n); thus our bounds are essentially tight. Furthermore, if we are given side-information in the form of a super-graph of the actual graph (as is often the case), then the number of epidemic samples required -- in all cases -- becomes independent of the network size n.