A promising direction toward more robust clustering is to derive multiple candidate clusterings of a common set of objects and then combine them into a single consolidated clustering, which is expected to be better than any individual candidate. Given a set of candidate clusterings, we show that, for a particular pairwise potential used in Markov random fields, the maximum likelihood estimate is the clustering closest to the candidate set under a metric distance between clusterings. To minimize this distance, we present two combining methods based on a new similarity measure determined by the whole candidate set. We evaluate them on both artificial and real datasets, with candidate clusterings drawn from either the full feature space or subspaces. Experiments show that the combined clusterings not only lie closer to the candidate set but also achieve a smaller or comparable distance to the true clustering.
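The abstract does not spell out the two combining methods, so the following is only a minimal sketch of the general idea it describes: measure how often each pair of objects co-clusters across the candidate set (a co-association similarity determined by the whole set) and derive a consensus clustering from it. The `coassociation` and `consensus` functions and the majority-vote threshold of 0.5 are illustrative assumptions, not the paper's actual algorithms.

```python
from itertools import combinations

def coassociation(clusterings):
    """Fraction of candidate clusterings in which each pair of objects
    falls in the same cluster -- a similarity determined by the whole
    candidate set rather than by any single clustering."""
    n = len(clusterings[0])
    m = len(clusterings)
    sim = [[0.0] * n for _ in range(n)]
    for labels in clusterings:
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                sim[i][j] += 1.0 / m
                sim[j][i] += 1.0 / m
    for i in range(n):
        sim[i][i] = 1.0
    return sim

def consensus(clusterings, threshold=0.5):
    """Majority-vote consensus (an illustrative choice): link objects
    that co-cluster in more than `threshold` of the candidates and
    return the connected components as consensus labels."""
    sim = coassociation(clusterings)
    n = len(sim)
    parent = list(range(n))

    def find(x):
        # Union-find with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if sim[i][j] > threshold:
            parent[find(i)] = find(j)
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]
```

Note that the co-association matrix is invariant to label permutations across candidates, so a candidate that calls the same two groups "1" and "0" instead of "0" and "1" still reinforces the same consensus.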