Adaptive Sampling for k-Means Clustering

  • Authors:
  • Ankit Aggarwal;Amit Deshpande;Ravi Kannan

  • Affiliations:
  • IIT Delhi;Microsoft Research, India;Microsoft Research, India

  • Venue:
  • APPROX '09 / RANDOM '09 Proceedings of the 12th International Workshop and 13th International Workshop on Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques
  • Year:
  • 2009

Abstract

We show that adaptively sampled O(k) centers give a constant factor bi-criteria approximation for the k-means problem, with a constant probability. Moreover, these O(k) centers contain a subset of k centers which give a constant factor approximation, and can be found using LP-based techniques of Jain and Vazirani [JV01] and Charikar et al. [CGTS02]. Both these algorithms run in effectively O(nkd) time and extend the O(log k)-approximation achieved by the k-means++ algorithm of Arthur and Vassilvitskii [AV07].
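
The abstract does not spell out the sampling rule, so the following is only a minimal sketch. It assumes that "adaptive sampling" means the D² sampling used by k-means++ [AV07]: each new center is drawn with probability proportional to its squared distance from the nearest center chosen so far. The names `sq_dist` and `adaptive_sample` are hypothetical, and the number of centers is left to the caller because the constant hidden in O(k) is not stated here.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def adaptive_sample(points, num_centers, seed=None):
    """Pick centers by D^2 sampling (assumed reading of 'adaptive sampling').

    Returns (centers, cost), where cost is the sum over all points of the
    squared distance to the nearest chosen center (the k-means objective
    evaluated on the chosen centers).
    """
    rng = random.Random(seed)

    # First center: uniform at random, as in k-means++.
    centers = [rng.choice(points)]
    # d2[i] = squared distance from points[i] to its nearest center so far.
    d2 = [sq_dist(p, centers[0]) for p in points]

    while len(centers) < num_centers:
        total = sum(d2)
        if total == 0:
            # Every point already coincides with a chosen center.
            break
        # Sample an index with probability d2[i] / total.
        idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        centers.append(points[idx])
        # One O(n d) pass updates the nearest-center distances.
        d2 = [min(old, sq_dist(p, points[idx])) for old, p in zip(d2, points)]

    return centers, sum(d2)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    chosen, cost = adaptive_sample(data, num_centers=2, seed=0)
    print(chosen, cost)
```

Each round makes a single pass over the n points in d dimensions, so drawing O(k) centers this way costs O(nkd) time, in line with the running time quoted in the abstract. The LP-based step that extracts k of the sampled centers is not sketched here.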