From comparing clusterings to combining clusterings

Authors:
Zhiwu Lu;Yuxin Peng;Jianguo Xiao
Affiliations:
Institute of Computer Science and Technology, Peking University, Beijing, China;Institute of Computer Science and Technology, Peking University, Beijing, China;Institute of Computer Science and Technology, Peking University, Beijing, China
Venue:
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Year:
2008

Abstract

This paper presents a fast simulated annealing framework for combining multiple clusterings (i.e., a clustering ensemble) based on measures of agreement between partitions. These measures were originally used to compare two clusterings (the obtained clustering vs. a ground-truth clustering) when evaluating a clustering algorithm. Although a greedy strategy can optimize these measures as objective functions of the clustering ensemble, it may become trapped in local optima, and its computational cost is too high. To avoid local optima, we consider a simulated annealing optimization scheme that operates through single label changes. Moreover, measures based on the relationship (joined or separated) of pairs of objects, such as the Rand index, can be updated incrementally for each label change, which keeps the simulated annealing scheme computationally feasible. Simulation and real-life experiments demonstrate that the proposed framework can achieve superior results.
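
The incremental update mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the objective is the mean Rand index between the candidate partition and each ensemble member, a Metropolis acceptance rule, and a geometric cooling schedule. The function names (`rand_index`, `mean_rand`, `delta_mean_rand`, `anneal`) and all parameter values are hypothetical.

```python
import math
import random
from itertools import combinations


def rand_index(u, v):
    """Rand index: fraction of object pairs on which u and v agree."""
    pairs = list(combinations(range(len(u)), 2))
    agree = sum((u[i] == u[j]) == (v[i] == v[j]) for i, j in pairs)
    return agree / len(pairs)


def mean_rand(labels, ensemble):
    """Assumed objective: average Rand index against every ensemble member."""
    return sum(rand_index(labels, p) for p in ensemble) / len(ensemble)


def delta_mean_rand(labels, ensemble, i, new):
    """Change in mean_rand if object i moves to cluster `new`.

    Only pairs (i, j) with j in the old or new cluster of i change status,
    so the full pair count never has to be recomputed.
    """
    old = labels[i]
    if new == old:
        return 0.0
    n = len(labels)
    change = 0
    for p in ensemble:
        for j in range(n):
            if j == i:
                continue
            if labels[j] == old:    # pair (i, j): joined -> separated
                change += -1 if p[i] == p[j] else 1
            elif labels[j] == new:  # pair (i, j): separated -> joined
                change += 1 if p[i] == p[j] else -1
    n_pairs = n * (n - 1) // 2
    return change / (n_pairs * len(ensemble))


def anneal(ensemble, k, steps=20000, t0=0.1, cooling=0.9995, seed=0):
    """Simulated annealing over single label changes (illustrative settings)."""
    rng = random.Random(seed)
    n = len(ensemble[0])
    labels = [rng.randrange(k) for _ in range(n)]
    score = mean_rand(labels, ensemble)
    best, best_score = labels[:], score
    t = t0
    for _ in range(steps):
        i, new = rng.randrange(n), rng.randrange(k)
        d = delta_mean_rand(labels, ensemble, i, new)
        # Always accept improvements; accept worse moves with prob. exp(d / t).
        if d >= 0 or rng.random() < math.exp(d / t):
            labels[i] = new
            score += d
            if score > best_score:
                best, best_score = labels[:], score
        t *= cooling
    return best, best_score
```

In this sketch each proposed move costs time proportional to the number of objects times the number of ensemble members, whereas recomputing the objective from scratch would be quadratic in the number of objects per member. Calling `anneal(ensemble, k)` with a list of equal-length label lists returns the best partition found and its score.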