Weighted k-means for density-biased clustering

  • Authors:
  • Kittisak Kerdprasop; Nittaya Kerdprasop; Pairote Sattayatham

  • Affiliations:
  • Data Engineering and Knowledge Discovery Research Unit, School of Computer Engineering, Suranaree University of Technology, Nakhon Ratchasima, Thailand; Data Engineering and Knowledge Discovery Research Unit, School of Computer Engineering, Suranaree University of Technology, Nakhon Ratchasima, Thailand; School of Mathematics, Suranaree University of Technology, Nakhon Ratchasima, Thailand

  • Venue:
  • DaWaK'05: Proceedings of the 7th International Conference on Data Warehousing and Knowledge Discovery
  • Year:
  • 2005

Abstract

Clustering is the task of grouping data by similarity. The popular k-means algorithm groups data by first assigning each data point to its closest cluster and then recomputing the cluster means, repeating these two steps until convergence. We propose a variation, called weighted k-means, to improve clustering scalability. To speed up the clustering process, we develop reservoir-biased sampling, an efficient data reduction technique that requires only a single scan over the data set. Our algorithm is designed to cluster data drawn from mixture models. We present an experimental evaluation of the proposed method.
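
The two ideas named in the abstract, single-scan sampling and k-means with per-point weights, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the reservoir is biased or how the weights are derived, so the sketch assumes plain uniform reservoir sampling (Algorithm R) and weights supplied by the caller. The names `reservoir_sample` and `weighted_kmeans` are hypothetical.

```python
import random


def reservoir_sample(stream, size, rng):
    """Uniform reservoir sampling (Algorithm R): one scan, O(size) memory.

    Assumption: the paper's sampler is density-biased; this uniform
    version only illustrates the single-scan idea.
    """
    reservoir = []
    for i, item in enumerate(stream):
        if i < size:
            reservoir.append(item)       # fill the reservoir first
        else:
            j = rng.randint(0, i)        # inclusive on both ends
            if j < size:
                reservoir[j] = item      # replace with probability size/(i+1)
    return reservoir


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def weighted_kmeans(points, weights, centers, max_iter=100):
    """Lloyd-style iteration in which each point carries a weight.

    Assumption: weights are given by the caller (for example, how many
    original points a sampled point stands for).
    """
    centers = [tuple(float(x) for x in c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Step 1: assign every point to its closest center.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        # Step 2: recompute each center as the weighted mean of its points.
        new_centers = []
        for k, old in enumerate(centers):
            members = [(p, w) for p, w, l in zip(points, weights, labels) if l == k]
            total = sum(w for _, w in members)
            if total == 0:
                new_centers.append(old)  # keep an empty cluster where it was
                continue
            new_centers.append(
                tuple(sum(w * p[d] for p, w in members) / total for d in range(len(old)))
            )
        if new_centers == centers:       # converged: centers stopped moving
            break
        centers = new_centers
    return centers, labels
```

In this sketch a heavier point pulls its cluster mean toward itself, which is one way a small sample could stand in for the denser regions of the full data set.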