Bayesian cluster ensembles

  • Authors:
  • Hongjun Wang; Hanhuai Shan; Arindam Banerjee

  • Affiliations:
  • Information Research Institute, Southwest Jiaotong University, Chengdu, Sichuan, 610031, China; Department of Computer Science & Engineering, University of Minnesota, Twin Cities, Minneapolis, MN 55455; Department of Computer Science & Engineering, University of Minnesota, Twin Cities, Minneapolis, MN 55455

  • Venue:
  • Statistical Analysis and Data Mining
  • Year:
  • 2011

Abstract

Cluster ensembles provide a framework for combining multiple base clusterings of a dataset to generate a stable and robust consensus clustering. There are important variants of the basic cluster ensemble problem, notably including cluster ensembles with missing values and row- or column-distributed cluster ensembles. Existing cluster ensemble algorithms are applicable only to a small subset of these variants. In this paper, we propose the Bayesian cluster ensemble (BCE), a mixed-membership model for learning cluster ensembles that is applicable to all the primary variants of the problem. We propose a variational approximation-based algorithm for learning Bayesian cluster ensembles. BCE is further generalized to deal with the case where the features of the original data points are available, referred to as the generalized BCE (GBCE). We compare BCE extensively with several other cluster ensemble algorithms, and demonstrate that BCE is not only versatile in terms of its applicability but also outperforms the other algorithms in terms of stability and accuracy. Moreover, GBCE can achieve higher accuracy than BCE, especially when only a small number of base clusterings is available. © 2011 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 4: 54–70, 2011.
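
To make the problem setup in the abstract concrete, the sketch below shows the generic cluster ensemble input: a matrix whose columns are base clusterings of the same data points, from which a consensus clustering is derived. This is not the paper's BCE model or its variational algorithm; it uses a simple co-association consensus purely as an illustration, and names such as `base_clusterings` are hypothetical.

```python
# Illustrative sketch only -- NOT the BCE algorithm from the paper.
# It shows the cluster-ensemble input (a base-clustering matrix) and a
# naive co-association consensus, to make the problem statement concrete.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical input: 6 data points (rows), 3 base clusterings (columns).
base_clusterings = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
    [1, 1, 1],
    [1, 0, 1],
])
n_points, n_base = base_clusterings.shape

# Co-association matrix: fraction of base clusterings placing points i and j together.
co_assoc = np.zeros((n_points, n_points))
for b in range(n_base):
    labels = base_clusterings[:, b]
    co_assoc += (labels[:, None] == labels[None, :]).astype(float)
co_assoc /= n_base

# Consensus clustering: hierarchical clustering on (1 - co-association) distances.
dist = squareform(1.0 - co_assoc, checks=False)
consensus = fcluster(linkage(dist, method="average"), t=2, criterion="maxclust")
print(consensus)
```

A probabilistic model such as BCE replaces this ad hoc consensus step with a generative, mixed-membership account of how the base clustering labels arise, which is what allows it to handle missing values and row- or column-distributed settings within one framework.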