Approximate clustering via core-sets
STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
An Efficient k-Means Clustering Algorithm: Analysis and Implementation
IEEE Transactions on Pattern Analysis and Machine Intelligence
Constrained K-means Clustering with Background Knowledge
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Pattern Classification (2nd Edition)
Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time
Journal of the ACM (JACM)
A local search approximation algorithm for k-means clustering
Computational Geometry: Theory and Applications - Special issue on the 18th Annual Symposium on Computational Geometry (SoCG 2002)
A Simple Linear Time (1+ε)-Approximation Algorithm for k-Means Clustering in Any Dimensions
FOCS '04 Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science
Random knapsack in expected polynomial time
Journal of Computer and System Sciences - Special issue: STOC 2003
How Fast Is the k-Means Method?
Algorithmica
How slow is the k-means method?
Proceedings of the twenty-second annual symposium on Computational geometry
The Effectiveness of Lloyd-Type Methods for the k-Means Problem
FOCS '06 Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science
Clustering with Bregman Divergences
The Journal of Machine Learning Research
Average-Case and Smoothed Competitive Analysis of the Multilevel Feedback Algorithm
Mathematics of Operations Research
k-means++: the advantages of careful seeding
SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
Worst case and probabilistic analysis of the 2-Opt algorithm for the TSP: extended abstract
SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
Improved smoothed analysis of the k-means method
SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
Coresets and approximate clustering for Bregman divergences
SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
The Planar k-Means Problem is NP-Hard
WALCOM '09 Proceedings of the 3rd International Workshop on Algorithms and Computation
NP-hardness of Euclidean sum-of-squares clustering
Machine Learning
Beyond Hirsch Conjecture: Walks on Random Polytopes and Smoothed Complexity of the Simplex Method
SIAM Journal on Computing
Worst-Case and Smoothed Analysis of the ICP Algorithm, with an Application to the k-Means Method
SIAM Journal on Computing
Worst-Case and Smoothed Analysis of k-Means Clustering with Bregman Divergences
ISAAC '09 Proceedings of the 20th International Symposium on Algorithms and Computation
k-Means Has Polynomial Smoothed Complexity
FOCS '09 Proceedings of the 2009 50th Annual IEEE Symposium on Foundations of Computer Science
Clustering for metric and nonmetric distance measures
ACM Transactions on Algorithms (TALG)
k-means Requires Exponentially Many Iterations Even in the Plane
Discrete & Computational Geometry - Special Issue: 25th Annual Symposium on Computational Geometry; Guest Editor: John Hershberger
The Effectiveness of Lloyd-Type Methods for the k-Means Problem
Journal of the ACM (JACM)
Nonlinear multicriteria clustering based on multiple dissimilarity matrices
Pattern Recognition
Optimising sum-of-squares measures for clustering multisets defined over a metric space
Discrete Applied Mathematics
Scalable K-Means by ranked retrieval
Proceedings of the 7th ACM international conference on Web search and data mining
The k-means method is one of the most widely used clustering algorithms, drawing its popularity from its speed in practice. Recently, however, it was shown to have exponential worst-case running time. To close the gap between practical performance and theoretical analysis, the k-means method has been studied in the model of smoothed analysis. But even the smoothed analyses obtained so far are unsatisfactory, as their bounds are still super-polynomial in the number n of data points. In this article, we settle the smoothed running time of the k-means method. We show that the smoothed number of iterations is bounded by a polynomial in n and 1/σ, where σ is the standard deviation of the Gaussian perturbations. This means that if an arbitrary input data set is randomly perturbed, then the k-means method will run in expected polynomial time on that input set.
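For context, the k-means method discussed above alternates an assignment step and a centroid-update step until the clustering stabilizes; the smoothed-analysis model perturbs each input coordinate with independent Gaussian noise of standard deviation σ before the algorithm runs. The sketch below illustrates both, with illustrative data, k, and σ values that are not taken from the article:

```python
import math
import random

def lloyd_kmeans(points, k, rng, max_iter=1000):
    """Run the k-means (Lloyd's) method until the clustering stabilizes.

    Returns the final centers and the number of iterations performed.
    """
    # Seed with k distinct input points chosen at random.
    centers = rng.sample(points, k)
    for it in range(1, max_iter + 1):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda j: math.dist(p, centers[j]))
            clusters[j].append(p)
        # Update step: move each center to the centroid of its cluster
        # (an empty cluster keeps its old center).
        new_centers = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:  # clustering stabilized: local optimum reached
            return centers, it
        centers = new_centers
    return centers, max_iter

rng = random.Random(0)
data = [(rng.random(), rng.random()) for _ in range(200)]  # arbitrary input set
sigma = 0.1
# Smoothed-analysis model: perturb every coordinate with Gaussian noise of
# standard deviation sigma before running the algorithm.
smoothed = [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma)) for x, y in data]
centers, iterations = lloyd_kmeans(smoothed, k=3, rng=rng)
print(iterations)
```

The article's result concerns exactly this quantity: the expected value of `iterations` over the random perturbation is polynomial in n and 1/σ, even though for carefully constructed unperturbed inputs it can be exponential.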