On consensus clustering validation

  • Authors:
  • João M. M. Duarte;Ana L. N. Fred;André Lourenço;F. Jorge F. Duarte

  • Affiliations:
  • Instituto de Telecomunicações, Instituto Superior Técnico, Lisboa, Portugal and GECAD, Instituto Superior de Engenharia do Porto, Porto, Portugal;Instituto de Telecomunicações, Instituto Superior Técnico, Lisboa, Portugal;Instituto de Telecomunicações, Instituto Superior Técnico, Lisboa, Portugal;GECAD, Instituto Superior de Engenharia do Porto, Porto, Portugal

  • Venue:
  • SSPR&SPR'10 Proceedings of the 2010 joint IAPR international conference on Structural, syntactic, and statistical pattern recognition
  • Year:
  • 2010

Abstract

Work on clustering combination has shown that combination methods typically outperform single runs of clustering algorithms. While much work has been reported in the literature on validating data partitions produced by traditional clustering algorithms, little has been done to validate data partitions produced by clustering combination methods. We propose to assess the quality of a consensus partition using a pairwise pattern similarity induced from the set of data partitions that constitutes the clustering ensemble. We propose a new validity index based on the likelihood of the data set given a data partition, together with three modified versions of well-known clustering validity indices. The validity measures on the original, clustering ensemble, and similarity spaces are analysed and compared using experimental results on several synthetic and real data sets.
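
To make the idea of a pairwise similarity induced from a clustering ensemble concrete, the sketch below builds a co-association matrix: the fraction of ensemble partitions in which two patterns fall in the same cluster. The abstract does not state which similarity the authors use, so the co-association choice is an assumption here. The function names `co_association` and `mean_within_similarity` are hypothetical, and the within-cluster average is only an illustrative score, not one of the indices proposed in the paper.

```python
import numpy as np


def co_association(ensemble):
    """Pairwise similarity induced from an ensemble (assumed: co-association).

    `ensemble` holds one label vector per partition, all over the same patterns.
    Entry (i, j) is the fraction of partitions placing i and j in one cluster.
    """
    labels = np.asarray(ensemble)                    # (n_partitions, n_patterns)
    same = labels[:, :, None] == labels[:, None, :]  # per-partition co-membership
    return same.mean(axis=0)


def mean_within_similarity(similarity, partition):
    """Illustrative score only: mean similarity over distinct same-cluster pairs."""
    partition = np.asarray(partition)
    same = partition[:, None] == partition[None, :]
    mask = same & ~np.eye(len(partition), dtype=bool)  # drop self-pairs
    return float(similarity[mask].mean()) if mask.any() else 0.0


# Toy ensemble: three partitions of four patterns (made-up labels).
ensemble = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
similarity = co_association(ensemble)
score = mean_within_similarity(similarity, [0, 0, 1, 1])
```

Under this assumption, a candidate consensus partition whose clusters group patterns that the ensemble often co-clusters receives a higher score, which is the intuition behind validating in the similarity space rather than in the original feature space.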