k-means Requires Exponentially Many Iterations Even in the Plane

Authors:
Andrea Vattani
Affiliations:
University of California, San Diego, 9500 Gilman Dr., 92093, La Jolla, CA, USA
Venue:
Discrete & Computational Geometry - Special Issue: 25th Annual Symposium on Computational Geometry; Guest Editor: John Hershberger
Year:
2011

Abstract

The k-means algorithm is a well-known method for partitioning n points in d-dimensional space into k clusters. Its main features are simplicity and speed in practice. Theoretically, however, the best known upper bound on its running time, $n^{O(kd)}$, is in general exponential in the number of points (when $kd=\varOmega (n/\log n)$). Recently, Arthur and Vassilvitskii (Proceedings of the 22nd Annual Symposium on Computational Geometry, pp. 144–153, 2006) gave a super-polynomial worst-case analysis, improving the best known lower bound from $\varOmega (n)$ to $2^{\varOmega (\sqrt{n})}$ with a construction in $d=\varOmega (\sqrt{n})$ dimensions. In the same paper, they also conjectured the existence of super-polynomial lower bounds for any $d\ge 2$. Our contribution is twofold: we prove this conjecture and we improve the lower bound, by presenting a simple construction in the plane that leads to the exponential lower bound $2^{\varOmega (n)}$.
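For orientation, the quantity bounded above is the number of rounds the k-means method performs before its clustering stops changing. The following is a minimal sketch, assuming the standard Lloyd-style formulation (assign each point to its nearest center, then move each center to the mean of its cluster). The function name `lloyd`, the parameter `max_iter`, and the four-point example are illustrative assumptions and do not come from the paper; in particular, this is not the paper's lower-bound construction.

```python
def lloyd(points, centers, max_iter=10_000):
    """Run Lloyd-style k-means from the given initial centers.

    Returns (centers, assignment, iterations), where `iterations` counts the
    rounds in which the assignment changed. Names are illustrative only.
    """
    centers = [tuple(c) for c in centers]
    assignment = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center
        # (squared Euclidean distance; ties go to the lowest index).
        new_assignment = [
            min(
                range(len(centers)),
                key=lambda j: sum((p - c) ** 2 for p, c in zip(x, centers[j])),
            )
            for x in points
        ]
        # Stable assignment: the method has converged.
        if new_assignment == assignment:
            return centers, assignment, iteration - 1
        assignment = new_assignment
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center (an assumption here).
        for j in range(len(centers)):
            members = [x for x, a in zip(points, assignment) if a == j]
            if members:
                centers[j] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centers, assignment, max_iter


# Tiny example in the plane with k = 2 (hypothetical data).
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
centers, assignment, iterations = lloyd(points, [(0, 0), (10, 0)])
```

On well-separated inputs like this one the method stabilizes almost immediately; the result stated in the abstract is that carefully constructed planar inputs force exponentially many such rounds.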