A Distributed Algorithm to Enumerate All Maximal Cliques in MapReduce

  • Authors:
  • Bin Wu; Shengqi Yang; Haizhou Zhao; Bai Wang

  • Venue:
  • FCST '09 Proceedings of the 2009 Fourth International Conference on Frontier of Computer Science and Technology
  • Year:
  • 2009

Abstract

Structure mining plays an important part in research in biology, physics, the Internet, and telecommunications within the recently emerging field of network science. As a main task in this area, maximal clique enumeration has attracted much interest and has been studied in various ways in prior work. However, most of these works rely on the computational capacity of a single chip and are constrained by local optimization, which makes it infeasible for them to process terabyte-scale datasets. In this paper, to extract maximal cliques from graphs, we propose a general enumeration process that runs in a distributed manner on a cluster system with the help of MapReduce. The graph is first split automatically into small subgraphs. A novel key-based clique enumeration algorithm is then proposed that operates on these subgraphs. We demonstrate that our algorithm has high parallelism and strong performance on extremely large graphs. Our method is implemented to fully utilize the MapReduce execution mechanism, and the experiments are discussed in the context of this distributed platform. We not only show the scalability and efficiency of the algorithm but also share some critical experience in using the MapReduce computing model.
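
The abstract does not give the details of the splitting step or of the key-based enumeration, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that each vertex acts as a key, that the subgraph for a key is the graph induced on that vertex's neighbors, and that each "reduce" call runs a plain Bron-Kerbosch search restricted to cliques whose smallest vertex is the key, so that no clique is reported twice. The function names (`split_into_subgraphs`, `reduce_key`, `enumerate_maximal_cliques`) are hypothetical, and the MapReduce phases are simulated in a single process.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list (self-loops ignored)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def split_into_subgraphs(adj):
    """Simulated 'map' phase (assumption): one small subgraph per key vertex.

    For key v, the subgraph is the graph induced on v's neighbors.
    """
    for v, nbrs in adj.items():
        yield v, {u: adj[u] & nbrs for u in nbrs}


def bron_kerbosch(r, p, x, nbrs, out):
    """Basic Bron-Kerbosch: r = current clique, p = candidates, x = excluded."""
    if not p and not x:
        out.append(sorted(r))
        return
    for u in list(p):
        bron_kerbosch(r | {u}, p & nbrs[u], x & nbrs[u], nbrs, out)
        p = p - {u}
        x = x | {u}


def reduce_key(v, subgraph):
    """Simulated 'reduce' phase: maximal cliques whose smallest vertex is v."""
    later = {u for u in subgraph if u > v}    # vertices that may extend the clique
    earlier = {u for u in subgraph if u < v}  # owned by a smaller key, so excluded
    out = []
    bron_kerbosch({v}, later, earlier, subgraph, out)
    return out


def enumerate_maximal_cliques(edges):
    """Run the simulated map and reduce phases and collect all maximal cliques."""
    adj = build_adjacency(edges)
    cliques = []
    for v, subgraph in split_into_subgraphs(adj):
        cliques.extend(reduce_key(v, subgraph))
    return sorted(cliques)


if __name__ == "__main__":
    edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
    print(enumerate_maximal_cliques(edges))  # [[1, 2, 3], [3, 4], [4, 5, 6]]
```

Because every key is processed independently of the others, the per-key calls could in principle be distributed across reducers; isolated vertices do not appear in an edge list and are therefore not reported by this sketch.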