Trail-and-Error approach for determining the number of clusters

  • Authors:
  • Haojun Sun; Mei Sun

  • Affiliations:
  • College of Mathematics and Computer Science, University of Hebei, Baoding, Hebei, China; College of Mathematics and Computer Science, University of Hebei, Baoding, Hebei, China

  • Venue:
  • ICMLC'05 Proceedings of the 4th international conference on Advances in Machine Learning and Cybernetics
  • Year:
  • 2005

Abstract

Automatically determining the number of clusters is an important issue in cluster analysis. In this paper, we explore a “trial-and-error” approach to determining the number of clusters in a given data set. The fuzzy clustering algorithm FCM (fuzzy c-means) is selected as the basic “trial” algorithm, and cluster validity optimization corresponds to the “error” procedure. To improve computation speed, we propose two strategies, eliminating and splitting, which make the FCM-based algorithms more efficient. To improve on existing validity measures, we use a new validity function that is particularly suited to data sets containing overlapping clusters. Experimental results are given to illustrate the performance of the new algorithms.
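
The trial-and-error idea in the abstract can be illustrated with a minimal sketch: run FCM once per candidate cluster count (the "trial") and keep the count with the best validity score (the "error" step). This is not the authors' algorithm. The abstract does not define the new validity function or the eliminating and splitting strategies, so the sketch substitutes the well-known Xie-Beni index (lower is better) and a plain exhaustive loop. The function names, the farthest-first initialization, the fuzzifier m = 2, and the synthetic data are all assumptions made for illustration.

```python
import random

def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, c):
    """Deterministic spread-out initial centers (an assumption, not from the paper)."""
    centers = [points[0]]
    while len(centers) < c:
        centers.append(max(points, key=lambda p: min(sqdist(p, v) for v in centers)))
    return [list(v) for v in centers]

def memberships(points, centers, m):
    """FCM membership update: u_ik is proportional to d_ik^(-2/(m-1))."""
    u = []
    for p in points:
        # sqdist is already squared, hence the exponent -1/(m-1); guard against d = 0.
        w = [max(sqdist(p, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
        total = sum(w)
        u.append([x / total for x in w])
    return u

def fcm(points, c, m=2.0, iters=100, tol=1e-6):
    """Standard fuzzy c-means; returns (centers, membership matrix)."""
    centers = farthest_first(points, c)
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for k in range(c):
            # Center update: mean of the points weighted by u_ik^m.
            wk = [row[k] ** m for row in u]
            total = sum(wk)
            new_centers.append(
                [sum(w * p[j] for w, p in zip(wk, points)) / total for j in range(dim)]
            )
        shift = max(abs(a - b) for v, nv in zip(centers, new_centers) for a, b in zip(v, nv))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)

def xie_beni(points, centers, u, m=2.0):
    """Xie-Beni index: fuzzy compactness divided by n times the minimum center separation."""
    compact = sum(
        (u[i][k] ** m) * sqdist(p, v)
        for i, p in enumerate(points)
        for k, v in enumerate(centers)
    )
    sep = min(sqdist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:])
    return compact / (len(points) * max(sep, 1e-12))

def choose_cluster_count(points, c_min=2, c_max=6):
    """Trial: cluster with each candidate c. Error: score it and keep the best."""
    scores = {}
    for c in range(c_min, c_max + 1):
        centers, u = fcm(points, c)
        scores[c] = xie_beni(points, centers, u)
    return min(scores, key=scores.get), scores

# Hypothetical demo data: three well-separated Gaussian blobs of 30 points each.
rng = random.Random(42)
blobs = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)) for cx, cy in blobs for _ in range(30)]
best, scores = choose_cluster_count(points, 2, 5)
print(best, scores)
```

The exhaustive loop reruns FCM from scratch for every candidate count, which is the cost the paper's eliminating and splitting strategies are meant to reduce. The Xie-Beni index is also a generic choice; the abstract's point is that a different validity function is needed when clusters overlap.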