Exploiting and Evaluating MapReduce for Large-Scale Graph Mining

  • Authors:
  • Hung-Che Lai; Cheng-Te Li; Yi-Chen Lo; Shou-De Lin

  • Venue:
  • ASONAM '12 Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)
  • Year:
  • 2012

Abstract

Graph mining is a popular technique for discovering hidden structures or important instances in a graph, but computational efficiency is usually a concern when dealing with large-scale graphs containing billions of entities. Cloud computing is widely regarded as a feasible solution to this problem. In this work, we present an open-source graph mining library called the MapReduce Graph Mining Framework (MGMF), designed to be a robust and efficient MapReduce-based graph mining tool. We start by dividing graph mining algorithms into four categories and designing a MapReduce framework for the algorithms in each category. The experimental results show that MGMF is 3 to 20 times more efficient than PEGASUS, a state-of-the-art library for graph mining on MapReduce. Moreover, it provides better coverage of different graph mining algorithms. We also validate our framework on billion-scale networks to demonstrate that it scales with the number of machines. Furthermore, we test and compare the feasibility of a single machine against that of cloud computing. The effects of different file input formats for MapReduce are investigated as well. Our open-source library can be downloaded from http://mslab.csie.ntu.edu.tw/~noahsark/MGMF/.