Maximizing Information Spread through Influence Structures in Social Networks

  • Authors:
  • Saurav Pandit;Yang Yang;Nitesh V. Chawla

  • Affiliations:
  • -;-;-

  • Venue:
  • ICDMW '12 Proceedings of the 2012 IEEE 12th International Conference on Data Mining Workshops
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Finding the most influential nodes in a network is a much discussed research topic of recent time in the area of network science, especially in social network analysis. The topic of this paper is a related, but harder problem. Given a social network where neighbors can influence each other, the problem is to identify k nodes such that if a piece of information is placed on each of those k nodes, the overall spread of that information (via word-of-mouth or other methods of influence flow) is maximized. The amount of information spread can be measured using existing information propagation models. Recent studies, which focus on how quickly k high-influential nodes can be found, tend to ignore the overall effect of the information spread. On the other hand some legacy methods, which look at all possible propagation paths to compute a globally optimal target set, present severe scalability challenges in large-scale networks. We present a simple, yet scalable (polynomial time) algorithm that outperforms the existing state-of-the-art, and its success does not depend significantly on any kind of tuning parameter. To be more precise, when compared to the existing algorithms, the output set of k nodes produced by our algorithm facilitates higher information spread -- in almost all the instances, consistently across the commonly used information propagation models. The original algorithm in this paper, although scalable, can have higher running time than some standard approaches, e.g. simply picking the top k nodes with highest degree or highest PageRank value. To that end, we provide an optional speedup mechanism that considerably reduces the time complexity while not significantly affecting the quality of results vis-a-vis the full version of our algorithm.