How fast is the k-means method?

  • Authors:
  • Sariel Har-Peled;Bardia Sadri

  • Affiliations:
  • University of Illinois/, Urbana, IL;University of Illinois/, Urbana, IL

  • Venue:
  • SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present polynomial upper and lower bounds on the number of iterations performed by the k-means method (a.k.a. Lloyd's method) for k-means clustering. Our upper bounds are polynomial in the number of points, number of clusters, and the spread of the point set. We also present a lower bound, showing that in the worst case the k-means heuristic needs to perform Ω(n) iterations, for n points on the real line and two centers. Surprisingly, the spread of the point set in this construction is polynomial. This is the first construction showing that the k-means heuristic requires more than a polylogarithmic number of iterations. Furthermore, we present two alternative algorithms, with guaranteed performance, which are simple variants of the k-means method.