Fast k-means algorithms with constant approximation

  • Authors:
  • Mingjun Song; Sanguthevar Rajasekaran

  • Affiliations:
  • Computer Science and Engineering, University of Connecticut, Storrs, CT; Computer Science and Engineering, University of Connecticut, Storrs, CT

  • Venue:
  • ISAAC'05: Proceedings of the 16th International Conference on Algorithms and Computation
  • Year:
  • 2005

Abstract

In this paper we study the k-means clustering problem. It is well known that the general version of this problem is $\mathcal{NP}$-hard, and numerous approximation algorithms have been proposed for it. We propose three constant-factor approximation algorithms for k-means clustering. The first algorithm runs in time $O(({{k}\over{\epsilon}})^{k}nd)$, where k is the number of clusters, n is the number of input points, and d is the dimension of the attributes. The second algorithm runs in time $O(k^{3}n^{2}\log n)$. This is the first algorithm for k-means clustering that runs in time polynomial in n, k and d. The run time of the third algorithm, $O(k^{5}\log^{3}kd)$, is independent of n. Though an algorithm whose run time is independent of n is known for the k-median problem, ours is the first such algorithm for the k-means problem.
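To make the notion of a constant approximation concrete, the following minimal sketch computes the standard k-means objective (the sum of squared distances from each point to its nearest center) and compares a candidate solution against the optimum on a tiny input. It does not implement any of the three algorithms from the paper; the function names `squared_distance` and `kmeans_cost` and the sample data are hypothetical, chosen only for illustration.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two d-dimensional points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """k-means objective: each point pays the squared distance to its nearest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


# Illustrative data (an assumption, not from the paper): n = 4 points, d = 2, k = 2.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

# Means of the two obvious clusters; for this tiny input they are optimal for k = 2.
optimal_centers = [(0.0, 0.5), (10.0, 0.5)]

# A suboptimal candidate that places each center on an input point.
candidate_centers = [(0.0, 0.0), (10.0, 0.0)]

optimal_cost = kmeans_cost(points, optimal_centers)
candidate_cost = kmeans_cost(points, candidate_centers)

# The approximation ratio of the candidate on this instance.
ratio = candidate_cost / optimal_cost
print(optimal_cost, candidate_cost, ratio)
```

An algorithm is a constant approximation if this ratio is bounded by a fixed constant on every input, regardless of n, k and d. Evaluating the objective this way takes O(nkd) time.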