K-Means Initialization Methods for Improving Clustering by Simulated Annealing

  • Authors: Gabriela Trazzi Perim, Estefhan Dazzi Wandekokem, Flávio Miguel Varejão

  • Affiliation (all authors): Universidade Federal do Espírito Santo, Departamento de Informática, Vitória-ES, Brasil, CEP 29060-900

  • Venue: IBERAMIA '08: Proceedings of the 11th Ibero-American Conference on AI: Advances in Artificial Intelligence
  • Year: 2008

Abstract

Clustering is the task of partitioning a data set so that elements within each subset are similar to one another and dissimilar to elements of other subsets. It can be cast as an optimization problem: find the best configuration of clusters among all possible configurations. K-means is the most popular approximate algorithm for clustering, but it is very sensitive to its initial solution and can get stuck in local optima. Metaheuristics can also be applied, but their direct application to clustering appears to be effective only on small data sets. This work proposes taking the methods used to find initial solutions for K-means and using them to initialize Simulated Annealing, so that the search is carried out near the global optimum.
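
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which initialization methods, neighborhood move, objective, or cooling schedule the paper uses. The sketch assumes a k-means++-style seeding as one example of a K-means initialization method, the within-cluster sum of squared errors as the cost, a single-point reassignment as the neighborhood move, and a geometric cooling schedule. All function names and parameter values (`seed_centers`, `anneal`, `t0`, `cooling`, and so on) are hypothetical.

```python
import math
import random


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]
    total = 0.0
    for p, c in zip(points, labels):
        centroid = [s / counts[c] for s in sums[c]]
        total += dist2(p, centroid)
    return total


def seed_centers(points, k, rng):
    """Assumed initializer: k-means++-style seeding.

    Each new center is drawn with probability proportional to its squared
    distance from the nearest center chosen so far. Requires at least k
    distinct points.
    """
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(dist2(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist2(p, centers[j]))
            for p in points]


def anneal(points, k, labels, rng, t0=1.0, cooling=0.95,
           steps_per_t=50, t_min=1e-3):
    """Simulated annealing over cluster assignments (illustrative parameters)."""
    current = list(labels)
    cur_cost = sse(points, current, k)
    best, best_cost = list(current), cur_cost
    t = t0
    while t > t_min:
        for _ in range(steps_per_t):
            i = rng.randrange(len(points))
            old, new = current[i], rng.randrange(k)
            if new == old:
                continue
            current[i] = new  # neighbor: move one point to another cluster
            cost = sse(points, current, k)
            delta = cost - cur_cost
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cur_cost = cost
                if cost < best_cost:
                    best, best_cost = list(current), cost
            else:
                current[i] = old  # reject the move
        t *= cooling  # geometric cooling
    return best, best_cost


# Synthetic data: three loose groups of ten 2-D points each.
rng = random.Random(0)
points = [(rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
          for cx, cy in [(0, 0), (3, 3), (0, 3)] for _ in range(10)]
k = 3

# Start Simulated Annealing from the seeded solution, not a random one.
init_labels = assign(points, seed_centers(points, k, rng))
init_cost = sse(points, init_labels, k)
best_labels, best_cost = anneal(points, k, init_labels, rng)
```

Because `anneal` keeps track of the best assignment it has visited, the cost it returns is never worse than the cost of the seeded starting solution. Whether such a start actually brings the search closer to the global optimum on larger data sets is the question the paper studies.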