Refining a divisive partitioning algorithm for unsupervised clustering

  • Authors:
  • Canasai Kruengkrai;Virach Sornlertlamvanich;Hitoshi Isahara

  • Affiliations:
  • Thai Computational Linguistics Laboratory, Communications Research Laboratory, 112 Paholyothin Road, Klong 1, Klong Luang, Pathumthani 12120, Thailand;Thai Computational Linguistics Laboratory, Communications Research Laboratory, 112 Paholyothin Road, Klong 1, Klong Luang, Pathumthani 12120, Thailand;Thai Computational Linguistics Laboratory, Communications Research Laboratory, 112 Paholyothin Road, Klong 1, Klong Luang, Pathumthani 12120, Thailand

  • Venue:
  • Design and application of hybrid intelligent systems
  • Year:
  • 2003

Abstract

The Principal Direction Divisive Partitioning (PDDP) algorithm is a fast and scalable clustering algorithm [3]. The basic idea is to recursively split the data set into sub-clusters based on principal direction vectors. However, the PDDP algorithm can yield poor results, especially when clusters are not well separated from one another. Its stopping criterion is based on a heuristic that tends to overestimate the number of clusters. In this paper, we propose simple and efficient solutions to these problems: refining the results of the splitting process, and applying the Bayesian Information Criterion (BIC) to estimate the true number of clusters. This motivates a novel algorithm for unsupervised clustering, whose experimental results on different data sets are very encouraging.
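
To make the basic splitting step concrete, the following is a minimal sketch of a single PDDP-style split, not the authors' implementation. It assumes data points are stored as rows of a NumPy array, takes the principal direction to be the leading right singular vector of the centered data, and divides points by the sign of their projection onto it. The function name `pddp_split` and the toy data are illustrative assumptions; the refinement step and the BIC-based stopping rule proposed in the paper are not shown.

```python
import numpy as np


def pddp_split(X):
    """Split the rows of X into two groups along the principal direction.

    Hypothetical helper for illustration: returns two index arrays,
    one per side of the hyperplane through the centroid.
    """
    X = np.asarray(X, dtype=float)
    centroid = X.mean(axis=0)
    centered = X - centroid

    # The leading right singular vector of the centered data is the
    # direction of maximum variance (the principal direction).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]

    # Project every point onto the principal direction and split by sign.
    projections = centered @ direction
    left = np.where(projections <= 0)[0]
    right = np.where(projections > 0)[0]
    return left, right


# Toy data (an assumption for this sketch): two well-separated blobs.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.1, size=(20, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.1, size=(20, 2))
X = np.vstack([blob_a, blob_b])

left, right = pddp_split(X)
print(len(left), len(right))
```

Applying the same function recursively to each resulting group would give the divisive hierarchy described above. Which group comes back as `left` and which as `right` depends on the sign of the singular vector, which is arbitrary. When clusters overlap, the sign-based cut can separate points that belong together, which is the weakness the abstract attributes to the splitting process.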