A Robust Methodology for Comparing Performances of Clustering Validity Criteria

  • Authors:
  • Lucas Vendramin, Ricardo J. Campello, Eduardo R. Hruschka

  • Affiliations:
  • Department of Computer Sciences, University of São Paulo at São Carlos (SCC/ICMC/USP), C.P. 668, São Carlos, Brazil 13560-970 (all authors)

  • Venue:
  • SBIA '08 Proceedings of the 19th Brazilian Symposium on Artificial Intelligence: Advances in Artificial Intelligence
  • Year:
  • 2008

Abstract

Many different clustering validity measures are available as quantitative criteria for evaluating the quality of data partitions, and they are very useful in practice. Faced with such a variety of possibilities, however, the user finds it hard to choose a specific measure. The present paper introduces an alternative, robust methodology for comparing clustering validity measures, designed specifically to circumvent conceptual flaws of the comparison paradigm traditionally adopted in the literature. An illustrative example is presented in which the performances of four well-known validity measures are compared over a collection of 7776 data partitions of 324 different data sets.
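
For concreteness, the sketch below illustrates the traditional comparison paradigm that the paper argues against, not the methodology it proposes: candidate partitions of a data set with a known reference partition are scored by several internal validity indices, and each index is then judged by how strongly its scores correlate with an external agreement measure. The specific indices (silhouette, Calinski-Harabasz, Davies-Bouldin), the use of k-means, the Adjusted Rand Index, Pearson correlation, and the synthetic data are illustrative assumptions; the abstract does not name the four measures actually studied.

# Minimal sketch of the traditional comparison paradigm (not the paper's
# proposed methodology). Indices, data generator, and correlation measure
# are illustrative choices.
from scipy.stats import pearsonr
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (adjusted_rand_score, calinski_harabasz_score,
                             davies_bouldin_score, silhouette_score)

# Synthetic data set with a known "true" partition into 4 clusters.
X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=1.2, random_state=0)

internal_indices = {
    "silhouette": silhouette_score,                # higher is better
    "calinski_harabasz": calinski_harabasz_score,  # higher is better
    "davies_bouldin": davies_bouldin_score,        # lower is better
}

# Generate candidate partitions by running k-means for several values of k,
# scoring each partition with every internal index and with the external
# Adjusted Rand Index (agreement with the known partition).
index_scores = {name: [] for name in internal_indices}
external_scores = []
for k in range(2, 11):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    external_scores.append(adjusted_rand_score(y_true, labels))
    for name, index in internal_indices.items():
        index_scores[name].append(index(X, labels))

# Traditional evaluation: an index is deemed better the more strongly its
# scores correlate (in absolute value, given its orientation) with the
# external agreement scores.
for name, scores in index_scores.items():
    r, _ = pearsonr(scores, external_scores)
    print(f"{name:>18s}: Pearson correlation with ARI = {r:+.3f}")

In a full study this loop would be repeated over many data sets and collections of partitions; the paper argues that this traditional paradigm suffers from conceptual flaws and proposes a more robust comparison methodology in its place.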