Clusterer ensemble

  • Authors: Zhi-Hua Zhou; Wei Tang
  • Affiliations: National Laboratory for Novel Software Technology, Nanjing University, Hankou Road 22, Nanjing 210093, China (both authors)
  • Venue: Knowledge-Based Systems
  • Year: 2006

Abstract

Ensemble methods that train multiple learners and then combine their predictions have been shown to be very effective in supervised learning. This paper explores ensemble methods for unsupervised learning. Here, an ensemble comprises multiple clusterers, each trained by the k-means algorithm from different initial points. The clusters discovered by different clusterers are aligned, i.e. similar clusters are assigned the same label, by counting the data items they share. Four methods are then developed to combine the aligned clusterers. Experiments show that ensemble methods can significantly improve clustering performance, and that using mutual information to select a subset of clusterers for weighted voting is a good choice. Because the proposed methods analyze the clustering results rather than the internal mechanisms of the component clusterers, they are applicable to diverse kinds of clustering algorithms.
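The alignment and combination steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each clusterer's output is already available as a list of integer labels in the range 0..k-1, it aligns clusters with a greedy largest-overlap matching (an assumption about how the overlap counts are used), and it combines the aligned labelings by plain unweighted majority voting rather than the mutual-information-based selection and weighting that the abstract recommends. The function names `align_labels` and `majority_vote` are hypothetical.

```python
from collections import Counter


def align_labels(reference, labels, k):
    """Relabel `labels` so that its clusters use the labels of `reference`.

    Assumption: clusters are matched greedily by largest overlap count.
    """
    # overlap[i][j] = number of items in reference cluster i and labels cluster j
    overlap = [[0] * k for _ in range(k)]
    for ref_label, other_label in zip(reference, labels):
        overlap[ref_label][other_label] += 1

    mapping = {}
    free_ref, free_other = set(range(k)), set(range(k))
    while free_other:
        # Pick the still-unmatched pair of clusters sharing the most items.
        i, j = max(
            ((i, j) for i in sorted(free_ref) for j in sorted(free_other)),
            key=lambda pair: overlap[pair[0]][pair[1]],
        )
        mapping[j] = i
        free_ref.remove(i)
        free_other.remove(j)
    return [mapping[label] for label in labels]


def majority_vote(clusterings, k):
    """Align every clustering to the first one, then vote per data item."""
    reference = clusterings[0]
    aligned = [reference] + [align_labels(reference, c, k) for c in clusterings[1:]]
    # zip(*aligned) yields, for each data item, its label from every clusterer.
    return [Counter(item_labels).most_common(1)[0][0] for item_labels in zip(*aligned)]


if __name__ == "__main__":
    # Three hypothetical clusterings of six items into k = 2 clusters.
    runs = [
        [0, 0, 0, 1, 1, 1],
        [1, 1, 1, 0, 0, 0],  # same partition with the labels swapped
        [0, 0, 1, 1, 1, 1],  # disagrees on the third item
    ]
    print(majority_vote(runs, k=2))
```

Because the sketch works only on label lists, it does not depend on which clustering algorithm produced them, which mirrors the abstract's point that the approach is not tied to k-means.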