A self-supervised framework for clustering ensemble

Authors:
Liang Du;Yi-Dong Shen;Zhiyong Shen;Jianying Wang;Zhiwu Xu
Affiliations:
State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China,Graduate University of Chinese Academy of Sciences, China,University of Chinese Academy ...;State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China,Graduate University of Chinese Academy of Sciences, China,University of Chinese Academy ...;Baidu Inc., Beijing, China;Computing Center, Shanghai University, Shanghai, China;State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China,Graduate University of Chinese Academy of Sciences, China,University of Chinese Academy ...
Venue:
WAIM'13 Proceedings of the 14th international conference on Web-Age Information Management
Year:
2013

Abstract

Clustering ensemble refers to combining a number of base clusterings of a particular data set into a single consensus clustering solution. In this paper, we propose a novel self-supervised learning framework for clustering ensemble. Specifically, we treat the base clusterings as pseudo class labels and learn a classifier for each of them. By adding priors to the parameters of these classifiers, we capture the relationships between different base clusterings and at the same time obtain a single consolidated clustering result. The proposed framework can also incorporate the original data features to improve the performance of the clustering ensemble. Another advantage, which distinguishes the proposed framework from traditional clustering ensemble approaches, is its generalization capability: it can assign incoming data instances to the consensus clusters directly, based on the original data features. We conduct extensive experiments on multiple real-world data sets to show the effectiveness of our method.
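
The general idea of treating base clusterings as pseudo labels can be illustrated with a minimal sketch. It is not the model from the paper: the abstract does not specify the classifiers or the priors, so the sketch assumes ridge-regression classifiers (the L2 penalty standing in for a simple Gaussian prior on the weights) and replaces the coupling priors with a plain k-means step over the concatenated classifier scores. All function names (fit_pseudo_label_classifier, embed, kmeans, fit_consensus, predict) and the parameter lam are hypothetical. Base cluster labels are assumed to be integers starting at 0.

```python
import numpy as np


def _with_bias(X):
    # Append a constant column so each linear classifier has an intercept.
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_pseudo_label_classifier(X, labels, lam=1e-2):
    """Ridge regression onto one-hot pseudo labels from one base clustering.

    Assumption: the L2 penalty `lam` acts as a simple Gaussian prior on the
    weights. It is NOT the coupling prior mentioned in the abstract.
    """
    labels = np.asarray(labels)
    k = labels.max() + 1
    Y = np.zeros((len(labels), k))
    Y[np.arange(len(labels)), labels] = 1.0
    Xb = _with_bias(X)
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)


def embed(X, weights):
    # Concatenate the class scores of every per-clustering classifier.
    Xb = _with_bias(X)
    return np.hstack([Xb @ W for W in weights])


def _nearest(Z, centers):
    # Index of the closest center (squared Euclidean distance) per row of Z.
    d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def kmeans(Z, k, n_iter=50):
    # Deterministic farthest-point initialisation, then Lloyd iterations.
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        assign = _nearest(Z, centers)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = Z[assign == j].mean(axis=0)
    return centers


def fit_consensus(X, base_labelings, k, lam=1e-2):
    # One classifier per base clustering, then a consensus in score space.
    weights = [fit_pseudo_label_classifier(X, b, lam) for b in base_labelings]
    centers = kmeans(embed(X, weights), k)
    return weights, centers


def predict(X_new, weights, centers):
    # Unseen points are assigned from their original features alone.
    return _nearest(embed(X_new, weights), centers)
```

Because the classifiers map original features to scores, calling predict on new feature vectors assigns them to consensus clusters without re-running any base clustering, which is the generalization property the abstract describes. Note that consensus cluster indices are arbitrary, so results should be compared up to a relabeling.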