Ensembles of Partitions via Data Resampling

  • Authors:
  • Behrouz Minaei-Bidgoli;Alexander Topchy;William F. Punch

  • Venue:
  • ITCC '04 Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'04), Volume 2
  • Year:
  • 2004

Abstract

The combination of multiple clusterings is a difficult problem in the practice of distributed data mining. Both the cluster generation mechanism and the partition integration process influence the quality of the combinations. In this paper we propose a data resampling approach for building cluster ensembles that are both robust and stable. In particular, we investigate the effectiveness of a bootstrapping technique in conjunction with several combination algorithms. The empirical study shows that a meaningful consensus partition for an entire set of objects emerges from multiple clusterings of bootstrap samples, given optimal combination algorithm parameters. Experimental results for ensembles with varying numbers of partitions and clusters are reported for simulated and real data sets. Experimental results show improved stability and accuracy for consensus partitions obtained via a bootstrapping technique.
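
The idea in the abstract (cluster several bootstrap samples, then combine the resulting partitions into one consensus partition over all objects) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name its combination algorithms, so the sketch assumes a co-association matrix cut at a threshold, which is one common consensus function, and uses a plain k-means as the base clusterer. All function names, parameters, and the toy data are hypothetical.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means on 2-D tuples; returns one label per point."""
    # Start from k distinct points so duplicated bootstrap draws
    # cannot produce two identical initial centers.
    centers = rng.sample(sorted(set(points)), k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    return labels

def co_association(data, n_partitions, k, seed=0):
    """Fraction of bootstrap partitions in which each pair of objects
    shares a cluster, counted only over partitions containing both."""
    rng = random.Random(seed)
    n = len(data)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_partitions):
        # Bootstrap sample: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        labels = kmeans([data[i] for i in idx], k, rng)
        label_of = dict(zip(idx, labels))  # repeated draws share a label
        for a in label_of:
            for b in label_of:
                sampled[a][b] += 1
                if label_of[a] == label_of[b]:
                    together[a][b] += 1
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]

def consensus(coassoc, threshold=0.5):
    """Assumed consensus function: connected components of the graph
    linking pairs whose co-association exceeds the threshold."""
    n = len(coassoc)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and coassoc[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Toy data (hypothetical): two well-separated groups of eight points each.
data = ([(0.1 * i, 0.05 * i) for i in range(8)]
        + [(10 + 0.1 * i, 10 - 0.05 * i) for i in range(8)])
labels = consensus(co_association(data, n_partitions=30, k=2, seed=1))
```

Objects left out of a given bootstrap sample contribute nothing for that partition, which is why the sketch normalizes each pair by the number of samples containing both objects. The threshold and the number of partitions are the kind of combination parameters the abstract says need tuning.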