Neighbor number, valley seeking and clustering

  • Authors:
  • Chaolin Zhang;Xuegong Zhang;Michael Q. Zhang;Yanda Li

  • Affiliations:
  • Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China and Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA and Depar ...;Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China;Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China and Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA;Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2007

Quantified Score

Hi-index 0.10

Abstract

This paper proposes a novel nonparametric clustering algorithm capable of identifying clusters of arbitrary shape. The algorithm is based on a nonparametric estimate of the normalized density derivative (NDD) and of the local convexity of the density function, both of which can be expressed concisely in terms of neighbor numbers. We use NDD to measure the dissimilarity between each pair of observations in a local neighborhood and to build a connectivity graph. Combined with the local convexity, this dissimilarity measure can detect observations in local minima (valleys) of the density function, which separate observations belonging to different major clusters. We show that the algorithm is closely related to single-linkage hierarchical clustering and can be viewed as an extension of it. The performance of the algorithm is tested on both synthetic and real datasets, and an example of color image segmentation is given. Comparisons with several representative existing algorithms show that the proposed method robustly identifies major clusters even when they have complex configurations and/or large overlaps.
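
The abstract does not give the NDD or convexity formulas, so the following is only a rough illustration of the general idea, not the authors' algorithm. It assumes that the "neighbor number" of an observation is the count of other observations within a fixed radius `r`, and it replaces the NDD and convexity tests with a much cruder rule: observations whose neighbor number falls below a threshold are treated as valley points and excluded before the connectivity graph is built. The names `neighbor_numbers`, `cluster_by_neighbor_graph`, and `min_neighbors` are hypothetical. With `min_neighbors=0` the function returns the connected components of the `r`-neighborhood graph, which is the partition single-linkage clustering gives when its dendrogram is cut at distance `r`.

```python
import math


def neighbor_numbers(points, r):
    """Count, for each point, how many other points lie within distance r."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) <= r)
        for i, p in enumerate(points)
    ]


def cluster_by_neighbor_graph(points, r, min_neighbors=0):
    """Label connected components of the r-neighborhood graph.

    Assumption (not from the paper): points with fewer than min_neighbors
    neighbors are treated as valley points, left out of the graph, and
    labeled -1. This is a crude stand-in for the NDD/convexity test.
    """
    counts = neighbor_numbers(points, r)
    keep = [c >= min_neighbors for c in counts]
    labels = [-1] * len(points)
    current = 0
    for start in range(len(points)):
        if not keep[start] or labels[start] != -1:
            continue
        # Depth-first search over kept points reachable through r-edges.
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(points)):
                if (keep[j] and labels[j] == -1
                        and math.dist(points[i], points[j]) <= r):
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Two dense 1-D groups joined by a sparse two-point bridge (toy data).
points = [(0.0,), (0.25,), (0.5,), (0.75,), (1.0,),
          (2.0,), (3.0,),
          (4.0,), (4.25,), (4.5,), (4.75,), (5.0,)]

plain = cluster_by_neighbor_graph(points, r=1.0)
valley = cluster_by_neighbor_graph(points, r=1.0, min_neighbors=3)
print(plain)
print(valley)
```

On this toy data the plain neighborhood graph chains both groups into a single cluster through the bridge, which is the familiar weakness of single linkage. Once the two low-count bridge points are excluded, the same graph splits into two clusters, with the bridge points labeled -1.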