A fast algorithm for clustering with MapReduce

  • Authors:
  • Yuqing Miao; Jinxing Zhang; Hao Feng; Liangpei Qiu; Yimin Wen

  • Affiliations:
  • School of Computer Science and Engineering, Guilin University of Electronic Technology, Guilin, China (all authors)

  • Venue:
  • ISNN'13 Proceedings of the 10th international conference on Advances in Neural Networks - Volume Part I
  • Year:
  • 2013

Abstract

MapReduce is a popular model in which the dataflow takes the form of a directed acyclic graph of operators. However, it lacks built-in support for iterative programs, which arise naturally in many clustering applications. Based on micro-clusters and an equivalence relation, we design a clustering algorithm that can be easily parallelized in MapReduce and completes in only a few MapReduce rounds. Experiments show that our algorithm runs fast and achieves good accuracy, and that it also scales well and attains high speedup.
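
The abstract does not spell out the algorithm, so the following is only a minimal single-process sketch of the general idea, under stated assumptions. It assumes that a micro-cluster is the set of points falling in one grid cell, that two micro-clusters are related when their centroids lie within a distance `eps`, and that final clusters are the equivalence classes of the transitive closure of that relation, computed here with union-find. The names `map_point`, `reduce_cell`, `cluster`, `cell_size`, and `eps` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from itertools import combinations
import math


def map_point(point, cell_size):
    """Map step (assumed): key each point by the grid cell it falls in."""
    key = tuple(math.floor(c / cell_size) for c in point)
    return key, point


def reduce_cell(points):
    """Reduce step (assumed): summarize a micro-cluster by centroid and size."""
    n = len(points)
    centroid = tuple(sum(coords) / n for coords in zip(*points))
    return centroid, n


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def cluster(points, cell_size=1.0, eps=1.5):
    """Return one integer cluster label per input point."""
    # Round 1: build micro-clusters (simulates one map and one reduce pass).
    groups = defaultdict(list)
    for p in points:
        key, value = map_point(p, cell_size)
        groups[key].append(value)
    summaries = {k: reduce_cell(v) for k, v in groups.items()}

    # Round 2: merge micro-clusters whose centroids are within eps.
    # The union-find roots identify the resulting equivalence classes.
    parent = {k: k for k in summaries}
    for a, b in combinations(summaries, 2):
        if math.dist(summaries[a][0], summaries[b][0]) <= eps:
            parent[find(parent, a)] = find(parent, b)

    # Label each point by the equivalence class of its micro-cluster.
    labels = {}
    result = []
    for p in points:
        root = find(parent, map_point(p, cell_size)[0])
        result.append(labels.setdefault(root, len(labels)))
    return result
```

For example, `cluster([(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (10.0, 10.0), (10.4, 10.2)])` groups the first three points together and the last two together. The point of the sketch is that no step iterates until convergence: one pass builds the summaries and one pass merges them, which is the kind of structure that fits a small, fixed number of MapReduce rounds.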