Clique percolation method for finding naturally cohesive and overlapping document clusters

Authors:
Wei Gao;Kam-Fai Wong;Yunqing Xia;Ruifeng Xu
Affiliations:
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong, China;Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong, China;Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong, China;Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong, China
Venue:
ICCPOL'06 Proceedings of the 21st international conference on Computer Processing of Oriental Languages: beyond the orient: the research challenges ahead
Year:
2006

Abstract

Techniques for finding document clusters mostly depend on models that impose strong explicit and/or implicit a priori assumptions. As a consequence, the resulting clusters tend to be unnatural and stray from the intrinsic grouping structure of a document collection. We apply a novel graph-theoretic technique called the Clique Percolation Method (CPM) to document clustering. In this method, highly cohesive maximal document cliques are enumerated in a random graph, and strongly adjacent cliques are merged to form naturally overlapping clusters. The clustering results can unveil the inherent structural connections of the underlying data. Experiments show that CPM can outperform some typical algorithms on benchmark data sets, and they shed light on its advantages for natural document clustering.
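
The clique enumeration and merging steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the document graph is already given as an adjacency dictionary (for example, built by thresholding pairwise document similarity, which the abstract does not specify), and it assumes the common k-clique rule, in which two cliques of size at least k are adjacent when they share at least k - 1 documents. The function names, the parameter `k`, and the toy graph are hypothetical.

```python
from itertools import combinations


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidate vertices, x: already-explored vertices
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def clique_percolation(adj, k=3):
    """Merge maximal cliques (size >= k) that share at least k - 1 nodes."""
    cliques = [c for c in maximal_cliques(adj) if len(c) >= k]
    parent = list(range(len(cliques)))  # union-find over clique indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    clusters = {}
    for i, clique in enumerate(cliques):
        clusters.setdefault(find(i), set()).update(clique)
    return sorted(sorted(c) for c in clusters.values())


# Toy document graph (assumed): an edge means two documents are similar enough.
graph = {
    "d1": {"d2", "d3"},
    "d2": {"d1", "d3", "d4"},
    "d3": {"d1", "d2", "d4"},
    "d4": {"d2", "d3", "d5", "d6"},
    "d5": {"d4", "d6"},
    "d6": {"d4", "d5", "d7"},
    "d7": {"d6"},
}

clusters = clique_percolation(graph, k=3)
print(clusters)
```

In this toy graph, document `d4` ends up in both clusters, which is the kind of overlap the method is meant to allow, and `d7` belongs to no cluster because it is not part of any clique of size 3.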