Cluster Ensemble Selection

Authors:
Xiaoli Z. Fern;Wei Lin
Affiliations:
School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA;School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
Venue:
Statistical Analysis and Data Mining
Year:
2008

Citing: 0
Cited by: 10

Adaptive cluster ensemble selection
IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence

A hierarchical information theoretic technique for the discovery of non linear alternative clusterings
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining

Bagging-based spectral clustering ensemble selection
Pattern Recognition Letters

A metric to evaluate a cluster by eliminating effect of complement cluster
KI'11 Proceedings of the 34th Annual German conference on Advances in artificial intelligence

Multi-objective design of hierarchical consensus functions for clustering ensembles via genetic programming
Decision Support Systems

Optimal clustering in the context of overlapping cluster analysis
Information Sciences: an International Journal

Cluster ensemble selection based on relative validity indexes
Data Mining and Knowledge Discovery

A probabilistic approach to latent cluster analysis
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence

Generating multiple alternative clusterings via globally optimal subspaces
Data Mining and Knowledge Discovery

Ensembles for unsupervised outlier detection: challenges and research questions a position paper
ACM SIGKDD Explorations Newsletter

Abstract

This paper studies the ensemble selection problem for unsupervised learning. Given a large library of different clustering solutions, our goal is to select a subset of solutions that forms a smaller cluster ensemble yet performs better than an ensemble built from all available solutions. We base our ensemble selection methods on quality and diversity, the two factors that have been shown to influence cluster ensemble performance. Our investigation reveals that using quality or diversity alone may not consistently improve performance. Based on these observations, we design three selection approaches that consider the two factors jointly. We empirically evaluate their performance against both full ensembles and a random selection strategy. Our results indicate that explicitly considering both quality and diversity in ensemble selection yields statistically significant performance improvements over full ensembles. Copyright © 2008 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 1: 000-000, 2008
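
The idea of selecting on quality and diversity together can be sketched in code. What follows is a minimal illustration under stated assumptions, not one of the three approaches the paper proposes (the abstract does not describe them). It assumes normalized mutual information (NMI) as the agreement measure between two clusterings, uses a solution's average NMI with the rest of the library as a stand-in for quality, and greedily adds the solution that best balances that quality against its dissimilarity to the solutions already chosen. The names nmi and select_ensemble and the weight alpha are hypothetical.

```python
from collections import Counter
from math import log, sqrt


def nmi(a, b):
    """Normalized mutual information of two labelings of the same objects.

    Normalized by the geometric mean of the two entropies (an assumption;
    other normalizations exist).
    """
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(c / n * log(c * n / (ca[x] * cb[y])) for (x, y), c in cab.items())
    ha = -sum(c / n * log(c / n) for c in ca.values())
    hb = -sum(c / n * log(c / n) for c in cb.values())
    if ha == 0 or hb == 0:
        # A single-cluster labeling carries no information.
        return 1.0 if ha == hb else 0.0
    return mi / sqrt(ha * hb)


def select_ensemble(library, k, alpha=0.5):
    """Greedily pick k solution indices from a library of labelings.

    alpha weights quality; (1 - alpha) weights diversity.
    Illustrative heuristic only.
    """
    m = len(library)
    if m < 2:
        raise ValueError("library needs at least two clustering solutions")
    sim = [[nmi(library[i], library[j]) for j in range(m)] for i in range(m)]
    # Quality proxy: average agreement with every other solution in the library.
    quality = [sum(sim[i][j] for j in range(m) if j != i) / (m - 1)
               for i in range(m)]
    # Start from the highest-quality solution.
    chosen = [max(range(m), key=lambda i: quality[i])]
    while len(chosen) < min(k, m):
        def score(i):
            # Diversity: one minus average NMI with the solutions chosen so far.
            diversity = 1 - sum(sim[i][j] for j in chosen) / len(chosen)
            return alpha * quality[i] + (1 - alpha) * diversity
        best = max((i for i in range(m) if i not in chosen), key=score)
        chosen.append(best)
    return chosen


# Toy library: solutions 0 and 1 are the same partition with swapped labels.
library = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 2, 2],
    [0, 1, 0, 1, 0, 1],
]
quality_only = select_ensemble(library, k=2, alpha=1.0)
balanced = select_ensemble(library, k=2, alpha=0.5)
```

On this toy library, selecting on quality alone (alpha=1.0) returns the two redundant copies of the same partition, while the balanced setting pairs one of them with a dissimilar solution. This mirrors, in miniature, the abstract's observation that a single criterion may not be enough; it says nothing about which weighting works well on real data.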