Connected components of undirected graphs play an important role in graph theory. Computing the connected components of a graph in linear time is straightforward with either breadth-first search or depth-first search. On large-scale data, however, both algorithms are hard to run. In this paper, we introduce a recently proposed community detection technique based on label propagation, discussed in [1]. Building on the label propagation algorithm (LPA), we propose a method to compute the connected components of an undirected graph. The method runs on a cluster system with the help of MapReduce and is implemented to fully use the MapReduce execution mechanism, namely the "map-reduce" process. Moreover, with a view to applying the algorithm in future "cloud" services, we use several large-scale datasets to demonstrate the efficiency and scalability of our solution.
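The abstract does not spell out the propagation rule, so the following is only a minimal single-machine sketch of how label propagation can find connected components in map-reduce style rounds. It assumes a minimum-label rule (every vertex repeatedly adopts the smallest label among itself and its neighbors), which is a common choice and not necessarily the paper's. The names `map_phase`, `reduce_phase`, and `connected_components` are hypothetical.

```python
from collections import defaultdict


def map_phase(labels, adjacency):
    """Emit (vertex, label) pairs: each vertex sends its label to itself and its neighbors."""
    for vertex, neighbors in adjacency.items():
        yield vertex, labels[vertex]
        for neighbor in neighbors:
            yield neighbor, labels[vertex]


def reduce_phase(pairs):
    """For each vertex, keep the smallest label it received (assumed rule)."""
    best = {}
    for vertex, label in pairs:
        if vertex not in best or label < best[vertex]:
            best[vertex] = label
    return best


def connected_components(edges, vertices=()):
    """Return a dict mapping each vertex to the smallest vertex id in its component."""
    adjacency = defaultdict(set)
    for v in vertices:
        adjacency[v]  # register isolated vertices with no neighbors
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    # Every vertex starts with its own id as its label.
    labels = {v: v for v in adjacency}
    while True:
        new_labels = reduce_phase(map_phase(labels, adjacency))
        if new_labels == labels:  # no label changed, so the rounds have converged
            return labels
        labels = new_labels


if __name__ == "__main__":
    print(connected_components([(1, 2), (2, 3), (4, 5)], vertices=[6]))
```

Each pass of the loop corresponds to one map-reduce job. Under this rule the number of rounds grows with the graph's diameter, which is one reason a practical distributed implementation may differ from this sketch.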