A method for initialising the K-means clustering algorithm using kd-trees

  • Authors:
  • Stephen J. Redmond; Conor Heneghan

  • Affiliations:
  • Department of Electronic Engineering, University College Dublin, Belfield, Dublin 4, Ireland (both authors)

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2007

Abstract

We present a method for initialising the K-means clustering algorithm. Our method relies on a kd-tree to estimate the density of the data at various locations. We then use a modification of Katsavounidis' algorithm, which incorporates this density information, to choose K seeds for the K-means algorithm. We test our algorithm on 36 synthetic datasets and 2 datasets from the UCI Machine Learning Repository, and compare it with 15 runs of Forgy's random initialisation method, Katsavounidis' algorithm, and Bradley and Fayyad's method.
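
The abstract gives only the outline of the procedure, so the following Python sketch is one possible reading of it rather than the authors' implementation. It assumes that the kd-tree is built by median splits on the widest dimension, that each leaf bucket's density is its point count divided by the volume of its bounding box, and that each new seed is the leaf centroid maximising (distance to nearest existing seed) × (leaf density). The function names `build_leaves`, `leaf_summary`, and `choose_seeds`, and the `bucket_size` default, are hypothetical.

```python
import math


def build_leaves(points, bucket_size):
    """Split points at the median of the widest dimension until each
    bucket holds at most bucket_size points; return the leaf buckets."""
    if len(points) <= bucket_size:
        return [points]
    dims = len(points[0])
    spans = [max(p[d] for p in points) - min(p[d] for p in points)
             for d in range(dims)]
    axis = spans.index(max(spans))
    if spans[axis] == 0:          # all points identical: cannot split further
        return [points]
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (build_leaves(ordered[:mid], bucket_size)
            + build_leaves(ordered[mid:], bucket_size))


def leaf_summary(bucket):
    """Return (centroid, density) for one leaf bucket.
    Density = point count / bounding-box volume (an assumed estimator)."""
    dims = len(bucket[0])
    centroid = tuple(sum(p[d] for p in bucket) / len(bucket)
                     for d in range(dims))
    volume = 1.0
    for d in range(dims):
        side = max(p[d] for p in bucket) - min(p[d] for p in bucket)
        volume *= max(side, 1e-9)  # crude guard against zero-width boxes
    return centroid, len(bucket) / volume


def choose_seeds(points, k, bucket_size=8):
    """Pick k seeds from leaf centroids, weighting distance by density."""
    leaves = [leaf_summary(b) for b in build_leaves(list(points), bucket_size)]
    if k > len(leaves):
        raise ValueError("fewer leaf buckets than requested seeds")
    # First seed: centroid of the densest leaf.
    first = max(range(len(leaves)), key=lambda i: leaves[i][1])
    seeds = [leaves[first][0]]
    min_dist = [math.dist(c, seeds[0]) for c, _ in leaves]
    while len(seeds) < k:
        # Katsavounidis-style farthest-point rule, scaled by leaf density.
        best = max(range(len(leaves)),
                   key=lambda i: min_dist[i] * leaves[i][1])
        seeds.append(leaves[best][0])
        min_dist = [min(m, math.dist(c, seeds[-1]))
                    for m, (c, _) in zip(min_dist, leaves)]
    return seeds
```

Calling `choose_seeds(data, K)` on a list of coordinate tuples returns K starting centres that can be handed to any K-means implementation. Weighting distance by density is intended to keep isolated outliers, which sit in sparse leaves, from being picked as seeds, a known weakness of a purely distance-based rule.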