Identifying clusters, the well-connected components of a graph, is useful in applications ranging from biological function prediction to social community detection. Finding these clusters becomes difficult as graphs grow, however, because most current graph clustering algorithms scale poorly in time or memory. An important insight is that many clustering applications need only the best clusters, not every cluster in the graph. In this paper we propose a new technique, Top Graph Clusters (TopGC), which probabilistically searches large, edge-weighted, directed graphs for their best clusters in linear time. The algorithm is inherently parallelizable and can find overlapping clusters of variable size. To increase scalability, a parameter is introduced that controls memory use. When compared with three other state-of-the-art clustering techniques, TopGC achieves running-time speedups of up to 70% on large-scale real-world datasets. In addition, the clusters returned by TopGC are consistently found to be better, both in calculated score and when compared against real-world benchmarks.
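The idea of keeping only the best clusters can be illustrated with a small sketch. This is not the TopGC algorithm, whose search procedure and scoring function are not given in the abstract. It assumes a hypothetical score (the average directed edge weight over all ordered node pairs in a candidate set) and a hypothetical list of candidate node sets supplied by the caller; the names `cluster_score` and `top_k_clusters` are illustrative only. A bounded min-heap holds at most `k` candidates at a time, which loosely mirrors the notion of a parameter that caps memory use.

```python
import heapq


def cluster_score(nodes, weights):
    """Hypothetical score: average directed edge weight over ordered node pairs."""
    nodes = list(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = sum(weights.get((u, v), 0.0) for u in nodes for v in nodes if u != v)
    return total / (n * (n - 1))


def top_k_clusters(candidates, weights, k):
    """Return the k highest-scoring candidates, best first, storing at most k at once."""
    heap = []  # min-heap of (score, insertion index, cluster); the index breaks ties
    for i, cand in enumerate(candidates):
        score = cluster_score(cand, weights)
        if len(heap) < k:
            heapq.heappush(heap, (score, i, frozenset(cand)))
        elif score > heap[0][0]:
            # The new candidate beats the weakest kept cluster, so swap it in.
            heapq.heapreplace(heap, (score, i, frozenset(cand)))
    return [(s, set(c)) for s, _, c in sorted(heap, reverse=True)]


# Toy directed, edge-weighted graph stored as {(source, target): weight}.
weights = {("a", "b"): 1.0, ("b", "a"): 1.0, ("b", "c"): 0.5, ("c", "d"): 0.2}
candidates = [{"a", "b"}, {"b", "c"}, {"a", "b", "c"}, {"c", "d"}]
best = top_k_clusters(candidates, weights, k=2)
```

In this toy run the two retained sets, `{a, b}` and `{a, b, c}`, share nodes, which shows how a top-k selection can return overlapping clusters of different sizes without ever storing a full partition of the graph.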