A note on ball segment picking related to clustering

  • Authors: Nicolas Wicker

  • Affiliations: Laboratoire de Bioinformatique et de Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/University of Strasbourg, BP 10142 ...

  • Venue: Pattern Recognition Letters
  • Year: 2011

Abstract

An important issue in clustering is automatically determining a number of clusters close to the true one. This paper revisits a method called density of points clustering (DPC), which tackles this problem by comparing the density inside a cluster with the density between two potential sub-clusters. It sheds light on the geometric probability aspect of the method by giving a closed-form formula for the probability distribution of the points generated by picking two points inside a p-dimensional ball (ball segment picking) and taking their midpoint. This sampling procedure is at the heart of DPC. The result shows that points sampled this way tend to be more concentrated towards the ball center than uniformly sampled points. The contribution of this study is to explain why DPC can produce good results.
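
The concentration effect described in the abstract can be checked numerically. The following is a minimal Monte Carlo sketch, not the paper's closed-form result: it assumes the two points are drawn independently and uniformly from the unit ball, and the function names (`sample_uniform_ball`, `sample_midpoint`, `mean_radius`) are hypothetical helpers introduced only for illustration.

```python
import math
import random


def sample_uniform_ball(p, rng):
    """Draw one point uniformly from the unit ball in R^p (assumed setup)."""
    # A normalized Gaussian vector gives a uniform direction on the sphere.
    g = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(x * x for x in g))
    # The radius U**(1/p) makes the point uniform in volume.
    r = rng.random() ** (1.0 / p)
    return [r * x / norm for x in g]


def sample_midpoint(p, rng):
    """Pick two points in the ball and return the midpoint of the segment."""
    a = sample_uniform_ball(p, rng)
    b = sample_uniform_ball(p, rng)
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def mean_radius(sampler, p, n, seed=0):
    """Average distance to the ball center over n sampled points."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        point = sampler(p, rng)
        total += math.sqrt(sum(x * x for x in point))
    return total / n


if __name__ == "__main__":
    for p in (2, 3, 10):
        uniform = mean_radius(sample_uniform_ball, p, 20000)
        midpoint = mean_radius(sample_midpoint, p, 20000)
        print(f"p={p}: uniform mean radius={uniform:.3f}, "
              f"midpoint mean radius={midpoint:.3f}")
```

Under these assumptions, the midpoints should show a smaller mean distance to the center than the uniformly sampled points in every dimension tried, which is consistent with the concentration towards the center stated above.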