k*-Means - A Generalized k-Means Clustering Algorithm with Unknown Cluster Number

  • Authors:
  • Yiu-ming Cheung

  • Affiliations:
  • -

  • Venue:
  • IDEAL '02 Proceedings of the Third International Conference on Intelligent Data Engineering and Automated Learning
  • Year:
  • 2002

Abstract

This paper presents a new clustering technique, the STepwise Automatic Rival-penalized (STAR) k-means algorithm (denoted k*-means), which generalizes the conventional k-means algorithm (MacQueen 1967). The new algorithm is applicable to ellipse-shaped data clusters, not just the ball-shaped ones that k-means handles, and it can also cluster appropriately without knowing the cluster number in advance, by gradually penalizing the winning chance of extra seed points during the learning competition. The existing RPCL algorithm (Xu et al. 1993) can also select the cluster number automatically by driving extra seed points far away from the input data set, but its performance is highly sensitive to the choice of the de-learning rate, and to the best of our knowledge there is still no theoretical result to guide that choice. In contrast, the proposed k*-means algorithm does not need this rate to be determined. We have qualitatively analyzed its rival-penalized mechanism, and the analysis is well supported by the experiments.
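
For context, the rival-penalizing step of RPCL that the abstract contrasts against can be sketched roughly as follows. This is a minimal illustration of the commonly described RPCL update, not the k*-means algorithm from the paper, whose details the abstract does not give. The function name `rpcl_step`, the parameters `lr` and `delearn`, and their default values are assumptions chosen for illustration only.

```python
def rpcl_step(x, seeds, wins, lr=0.05, delearn=0.002):
    """One RPCL-style update on a single input point x (illustrative sketch).

    seeds   : list of seed points (each a list of floats), modified in place
    wins    : list of win counts per seed, modified in place
    lr      : learning rate for the winner (assumed value)
    delearn : de-learning rate for the rival (assumed value)
    """
    total = sum(wins)

    def cost(j):
        # Squared distance weighted by the seed's relative winning frequency,
        # so seeds that win often are handicapped in later competitions.
        d2 = sum((xi - mi) ** 2 for xi, mi in zip(x, seeds[j]))
        return wins[j] / total * d2

    order = sorted(range(len(seeds)), key=cost)
    winner, rival = order[0], order[1]

    # The winner moves toward the input point.
    seeds[winner] = [m + lr * (xi - m) for xi, m in zip(x, seeds[winner])]
    # The rival (second-best seed) is pushed away by the de-learning rate.
    seeds[rival] = [m - delearn * (xi - m) for xi, m in zip(x, seeds[rival])]

    wins[winner] += 1
    return winner, rival
```

In this sketch `delearn` is the hand-tuned de-learning rate that the abstract describes as hard to choose for RPCL; according to the abstract, k*-means avoids having to set such a rate.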