A bad instance for k-means++

Authors: Tobias Brunsch, Heiko Röglin
Venue: Theoretical Computer Science
Year: 2013

Abstract

k-means++ is a seeding technique for the k-means method with an expected approximation ratio of $O(\log k)$, where $k$ denotes the number of clusters. Examples are known on which the expected approximation ratio of k-means++ is $\Omega(\log k)$, showing that the upper bound is asymptotically tight. However, it remained open whether k-means++ yields a constant approximation with probability $1/\mathrm{poly}(k)$ or even with constant probability. We settle this question and present instances on which k-means++ achieves an approximation ratio no better than $(2/3 - \varepsilon) \cdot \log k$ with probability exponentially close to 1.
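
For readers unfamiliar with the seeding rule the abstract refers to, the following is a minimal Python sketch of k-means++ seeding as it is commonly described: the first center is chosen uniformly at random, and each further center is drawn with probability proportional to its squared distance to the nearest center chosen so far. It is an illustration only, not code from the paper. The function names kmeanspp_seed and kmeans_cost, the uniform fallback for degenerate inputs, and the toy data are assumptions made for this example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers from `points` by D^2 sampling (hypothetical helper)."""
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # First center: uniform at random.
    centers = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest center.
    d2 = [squared_distance(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(d2) == 0:
            # Assumption: if every point coincides with a chosen center,
            # fall back to a uniform draw instead of failing.
            nxt = rng.choice(points)
        else:
            # Draw the next center with probability proportional to d2.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        d2 = [min(old, squared_distance(p, nxt)) for old, p in zip(d2, points)]
    return centers


def kmeans_cost(points, centers):
    """k-means objective: sum of squared distances to the nearest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


if __name__ == "__main__":
    # Toy data: two well-separated pairs of points.
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    seeds = kmeanspp_seed(data, 2, random.Random(0))
    print(seeds, kmeans_cost(data, seeds))
```

The approximation ratio discussed in the abstract compares the cost of such a seeding (the value returned by kmeans_cost in this sketch) with the cost of an optimal set of k centers.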