Enhanced bisecting k-means clustering using intermediate cooperation

  • Authors:
  • R. Kashef;M. S. Kamel

  • Affiliations:
  • University of Waterloo, Electrical and Computer Engineering Department, Waterloo, Canada;University of Waterloo, Electrical and Computer Engineering Department, Waterloo, Canada

  • Venue:
  • Pattern Recognition
  • Year:
  • 2009

Quantified Score

Hi-index 0.01

Visualization

Abstract

Bisecting k-means (BKM) is very attractive in many applications as document-retrieval/indexing and gene expression analysis problems. However, in some scenarios when a fraction of the dataset is left behind with no other way to re-cluster it again at each level of the binary tree, a ''refinement'' is needed to re-cluster the resulting solutions. Current approaches to refine the clustering solutions produced by the BKM employ end-result enhancement using k-means (KM) clustering. In this hybrid model, KM waits for the former BKM to finish its clustering and then it takes the final set of centroids as initial seeds for a better refinement. In this paper, a cooperative bisecting k-means (CBKM) clustering algorithm is presented. The CBKM concurrently combines the results of the BKM and KM at each level of the binary hierarchical tree using cooperative and merging matrices. Undertaken experimental results show that the CBKM achieves better clustering quality than that of KM, BKM, and single linkage (SL) algorithms with comparable time performance over a number of artificial, text documents, and gene expression datasets.