Fast Exact Computation of betweenness Centrality in Social Networks

  • Authors:
  • Miriam Baglioni;Filippo Geraci;Marco Pellegrini;Ernesto Lastres

  • Affiliations:
  • -;-;-;-

  • Venue:
  • ASONAM '12 Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Social networks have demonstrated in the last few years to be a powerful and flexible concept useful to represent and analyze data emerging form social interactions and social activities. The study of these networks can thus provide a deeper understanding of many emergent global phenomena. The amount of data available in the form of social networks data is growing by the day, and this poses many computational challenging problems for their analysis. In fact many analysis tools suitable to analyze small to medium sized networks are inefficient for large social networks. The computation of the betweenness centrality index is a well established method for network data analysis and it is also important as subroutine in more advanced algorithms, such as the Girvan-Newman method for graph partitioning. In this paper we present a new approach for the computation of the betweenness centrality, which speeds up considerably Brandes' algorithm (the current state of the art) in the context of social networks. Our approach exploits the natural sparsity of the data to algebraically (and efficiently) determine the betweenness of those nodes forming trees (tree-nodes) in the social network. Moreover, for the residual network, which is often of much smaller size, we modify directly the Brandes' algorithm so that we can remove the nodes already processed and perform the computation of the shortest paths only for the residual nodes. Tests conducted on a sample of publicly available large networks from the Stanford repository show that improvements of a factor ranging between 2 and 5 are possible on several such graphs, when the sparsity, measured by the ratio of tree-nodes to the total number of nodes, is in a medium range (30% to 50%). For some large networks from the Stanford repository and for a sample of social networks provided by Sistemi Territoriali with high sparsity (80% and above) tests show that our algorithm consistently runs between one and two orders of magnitude faster than the current state of the art exact algorithm.