A stage by stage pruning algorithm for detecting the number of clusters in a dataset

  • Authors: Yanqiao Zhu; Jinwen Ma

  • Affiliations: Department of Information Science, School of Mathematical Sciences & LMAM, Peking University, Beijing, P.R. China (both authors)

  • Venue: ICIC'10 Proceedings of the 6th international conference on Advanced intelligent computing theories and applications: intelligent computing
  • Year: 2010

Abstract

Determining the number of clusters in a dataset is one of the most challenging problems in cluster analysis. In this paper, we propose a stage-by-stage pruning algorithm to detect the number of clusters in a dataset. The main idea is to search the dataset for the representative points of clusters, those with the highest accumulation density, and to delete the other points in their neighborhoods stage by stage. As the radius of the neighborhood increases, the number of representative points found decreases. However, the structure of the actual clusters keeps the number of representative points stable at the true cluster number over a relatively large interval of the radius, which allows the cluster number to be detected. Simulation and practical experiments demonstrate that the proposed algorithm yields an effective estimate of the cluster number for a general dataset.
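
The abstract gives no formulas, so the following is only a minimal sketch of the idea it describes, not the authors' algorithm. It rests on several assumptions that are not taken from the paper: the accumulation density of a point is taken to be the number of points within the radius, densities are computed once on the full dataset, the stable interval is measured as the longest run of equal counts over an evenly spaced radius grid, and the trivial plateau at a single representative (which any sufficiently large radius produces) is ignored. The names `count_representatives`, `most_stable_count`, and `min_clusters` are hypothetical.

```python
import math


def count_representatives(points, radius):
    """Count the representatives found by greedy pruning at one radius."""
    if radius < 0:
        raise ValueError("radius must be non-negative")
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Assumed density: number of points within `radius`, the point itself included.
    density = [sum(d <= radius for d in row) for row in dist]
    remaining = set(range(n))
    count = 0
    while remaining:
        # Next representative: densest surviving point (ties go to the lowest index).
        rep = max(remaining, key=lambda i: (density[i], -i))
        count += 1
        # Prune every surviving point in its neighborhood, the representative included.
        remaining = {i for i in remaining if dist[rep][i] > radius}
    return count


def most_stable_count(points, radii, min_clusters=2):
    """Return the count with the longest run over `radii`, plus all counts.

    Runs whose count is below `min_clusters` are skipped (an assumption made
    here to avoid the trivial single-representative plateau).
    """
    counts = [count_representatives(points, r) for r in radii]
    best_count, best_len, run_start = None, 0, 0
    for i in range(1, len(counts) + 1):
        if i == len(counts) or counts[i] != counts[run_start]:
            run_len = i - run_start
            if counts[run_start] >= min_clusters and run_len > best_len:
                best_count, best_len = counts[run_start], run_len
            run_start = i
    return best_count, counts


# Toy data (hypothetical): three well-separated 3x3 grids of points.
offsets = (-0.5, 0.0, 0.5)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
demo_points = [(cx + dx, cy + dy)
               for cx, cy in centers for dx in offsets for dy in offsets]
demo_radii = [0.25 * k for k in range(1, 81)]  # 0.25, 0.50, ..., 20.0
estimated_k, demo_counts = most_stable_count(demo_points, demo_radii)
```

On this toy data the count starts at one representative per point for the smallest radius, falls to a single representative for the largest radii, and stays at three over the widest range in between, so `estimated_k` is 3. Skipping the single-representative plateau means this sketch cannot report a dataset that consists of one cluster; a practical version would need a different rule for bounding the radius range.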