Matrix multiplication via arithmetic progressions
Journal of Symbolic Computation - Special issue on computational algebraic complexity
MapReduce: simplified data processing on large clusters
Communications of the ACM - 50th anniversary issue: 1958 - 2008
Efficient semi-streaming algorithms for local triangle counting in massive graphs
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Main-memory triangle computations for very large (sparse (power-law)) graphs
Theoretical Computer Science
DOULION: counting triangles in massive graphs with a coin
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Graph Twiddling in a MapReduce World
Computing in Science and Engineering
PEGASUS: A Peta-Scale Graph Mining System Implementation and Observations
ICDM '09 Proceedings of the 2009 Ninth IEEE International Conference on Data Mining
Counting triangles and the curse of the last reducer
Proceedings of the 20th international conference on World wide web
Triangle listing in massive networks and its applications
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Uncovering social network sybils in the wild
Proceedings of the 2011 ACM SIGCOMM conference on Internet measurement conference
Finding, counting and listing all triangles in large graphs, an experimental study
WEA'05 Proceedings of the 4th international conference on Experimental and Efficient Algorithms
Matrix chain multiplication via multi-way join algorithms in MapReduce
Proceedings of the 6th International Conference on Ubiquitous Information Management and Communication
Hadoop: The Definitive Guide
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Hi-index | 0.00 |
Triangle counting problem is one of the fundamental problem in various domains. The problem can be utilized for computation of clustering coefficient, transitivity, trianglular connectivity, trusses, etc. The problem have been extensively studied in internal memory but the algorithms are not scalable for enormous graphs. In recent years, the MapReduce has emerged as a de facto standard framework for processing large data through parallel computing. A MapReduce algorithm was proposed for the problem based on graph partitioning. However, the algorithm redundantly generates a large number of intermediate data that cause network overload and prolong the processing time. In this paper, we propose a new algorithm based on graph partitioning with a novel idea of triangle classification to count the number of triangles in a graph. The algorithm substantially reduces the duplication by classifying triangles into three types and processing each triangle differently according to its type. In the experiments, we compare the proposed algorithm with recent existing algorithms using both synthetic datasets and real-world datasets that are composed of millions of nodes and billions of edges. The proposed algorithm outperforms other algorithms in most cases. Especially, for a twitter dataset, the proposed algorithm is more than twice as fast as existing MapReduce algorithms. Moreover, the performance gap increases as the graph becomes larger and denser.