Comparison Among Methods for k Estimation in k-means

  • Authors:
  • Murilo C. Naldi; André Fontana; Ricardo J. G. B. Campello

  • Venue:
  • ISDA '09 Proceedings of the 2009 Ninth International Conference on Intelligent Systems Design and Applications
  • Year:
  • 2009

Abstract

k-means, one of the most influential algorithms in data mining, is widely used in practice for its simplicity, computational efficiency, and effectiveness on high-dimensional problems. However, it has two major drawbacks: the number of clusters, k, must be chosen in advance, and the result is sensitive to the initial positions of the prototypes. This work compares systematic, evolutionary, and order heuristics designed to mitigate these drawbacks. In total, 27 variants of 4 algorithmic approaches are used to partition 324 synthetic data sets, and the results are compared.
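
To make the two drawbacks concrete, the following is a minimal sketch of one possible systematic strategy: run k-means several times from random initial prototypes for every k in a range, and keep the partition with the best validity score. It is not taken from the paper. The choice of the silhouette width as the validity criterion, the function names (kmeans, silhouette, estimate_k), and the parameters (k_min, k_max, n_init, seed) are all assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd-style k-means started from k randomly sampled points."""
    centers = rng.sample(points, k)  # random initial prototypes
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest prototype.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each prototype to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an emptied cluster keeps its old prototype
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette width; higher means more compact, separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treat as worst case
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def estimate_k(points, k_min=2, k_max=6, n_init=10, seed=0):
    """Hypothetical helper: multiple runs per k, best silhouette wins."""
    rng = random.Random(seed)
    best_score, best_k, best_labels = -2.0, None, None
    for k in range(k_min, k_max + 1):
        for _ in range(n_init):  # repeated runs soften initialization sensitivity
            labels = kmeans(points, k, rng)
            score = silhouette(points, labels)
            if score > best_score:
                # Report the number of non-empty clusters actually found.
                best_score, best_k, best_labels = score, len(set(labels)), labels
    return best_k, best_score, best_labels
```

Calling estimate_k on a list of coordinate tuples returns the estimated number of clusters, its silhouette score, and the corresponding labels. The repeated runs per k address the sensitivity to initial prototypes, and the sweep over k addresses the choice of k, at the cost of (k_max - k_min + 1) × n_init full k-means runs.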