PATRIC: a parallel algorithm for counting triangles in massive networks

  • Authors:
  • Shaikh Arifuzzaman;Maleq Khan;Madhav Marathe

  • Affiliations:
  • NDSSL, Virginia Bioinformatics Institute, Virginia Tech., Blacksburg, Virginia, USA;NDSSL, Virginia Bioinformatics Institute, Virginia Tech., Blacksburg, Virginia, USA;NDSSL, Virginia Bioinformatics Institute, Virginia Tech., Blacksburg, Virginia, USA

  • Venue:
  • Proceedings of the 22nd ACM International Conference on Information & Knowledge Management
  • Year:
  • 2013

Abstract

Massive networks arising in numerous application areas pose significant challenges for network analysts, as these networks grow to billions of nodes and become too large to fit in main memory. Finding the number of triangles in a network is an important problem in the analysis of complex networks, and several interesting graph mining applications depend on it. In this paper, we present an efficient MPI-based distributed-memory parallel algorithm, called PATRIC, for counting triangles in massive networks. PATRIC scales well to networks with billions of nodes and can compute the exact number of triangles in a network with one billion nodes and 10 billion edges in 16 minutes. Balancing computational load among processors for a graph problem such as triangle counting is challenging. We present and analyze several load-balancing schemes for the triangle counting problem; these schemes achieve very good load balancing. We also show how our parallel algorithm can adapt an existing edge sparsification technique to approximate the number of triangles with very high accuracy. This modification allows us to count triangles in even larger networks.
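
To make the two computations named in the abstract concrete, the following is a minimal serial sketch in Python, not PATRIC itself: it runs on a single machine and uses no MPI or load balancing. The function names `count_triangles` and `estimate_triangles` are hypothetical. The exact count uses a common degree-ordered neighbor-intersection approach, and the approximate count assumes one well-known sparsification rule (keep each edge independently with probability `p`, then scale the count by `1/p^3`); the abstract does not say which sparsification technique is adapted, so that choice is an assumption.

```python
import random
from collections import defaultdict


def count_triangles(edges):
    """Exact triangle count via degree-ordered neighbor intersection (serial sketch)."""
    # Build an undirected adjacency structure; sets drop duplicate edges,
    # and self-loops are skipped because they cannot form triangles.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    # Rank nodes by (degree, id). Ties on degree are broken by node id,
    # so the ranking is a strict total order (node ids must be comparable).
    def rank(n):
        return (len(adj[n]), n)

    # For each node keep only its higher-ranked neighbors. Each triangle is
    # then found exactly once, from its lowest-ranked node.
    higher = {u: {v for v in adj[u] if rank(v) > rank(u)} for u in adj}

    total = 0
    for u, nbrs in higher.items():
        for v in nbrs:
            # Common higher-ranked neighbors of u and v close a triangle.
            total += len(nbrs & higher[v])
    return total


def estimate_triangles(edges, p, seed=0):
    """Approximate count: sample edges with probability p, scale by 1/p^3.

    Assumption: this sampling rule stands in for the unnamed
    'existing edge sparsification technique' in the abstract.
    """
    rng = random.Random(seed)
    # Deduplicate undirected edges so each one is sampled once.
    unique = {(min(u, v), max(u, v)) for u, v in edges if u != v}
    kept = [e for e in sorted(unique) if rng.random() < p]
    # A triangle survives only if all three of its edges are kept,
    # which happens with probability p^3.
    return count_triangles(kept) / p ** 3


if __name__ == "__main__":
    # Complete graph on 5 nodes: C(5, 3) = 10 triangles.
    k5 = [(i, j) for i in range(5) for j in range(i + 1, 5)]
    print(count_triangles(k5))
    print(estimate_triangles(k5, p=1.0))
```

With `p = 1.0` every edge is kept, so the estimate equals the exact count; smaller values of `p` trade accuracy for a smaller graph to process, which is the property that would let a counting algorithm reach larger networks.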