Approximation schemes for Euclidean k-medians and related problems
STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing
On approximating arbitrary metrices by tree metrics
STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing
A threshold of ln n for approximating set cover
Journal of the ACM (JACM)
A constant-factor approximation algorithm for the k-median problem (extended abstract)
STOC '99 Proceedings of the thirty-first annual ACM symposium on Theory of computing
Greedy strikes back: improved facility location algorithms
Journal of Algorithms
Approximating min-sum k-clustering in metric spaces
STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing
Local search heuristic for k-median and facility location problems
STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing
A new greedy approach for facility location problems
STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
Computers and Intractability; A Guide to the Theory of NP-Completeness
Approximation schemes for clustering problems
Proceedings of the thirty-fifth annual ACM symposium on Theory of computing
A Simple Linear Time (1+ε)-Approximation Algorithm for k-Means Clustering in Any Dimensions
FOCS '04 Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science
Metric Embeddings with Relaxed Guarantees
FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science
The uniqueness of a good optimum for K-means
ICML '06 Proceedings of the 23rd international conference on Machine learning
The Effectiveness of Lloyd-Type Methods for the k-Means Problem
FOCS '06 Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science
Optimal hierarchical decompositions for congestion minimization in networks
STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
A discriminative framework for clustering via similarity functions
STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
Approximate clustering without the approximation
SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
Stability of k-means clustering
COLT'07 Proceedings of the 20th annual conference on Learning theory
Stability Yields a PTAS for k-Median and k-Means Clustering
FOCS '10 Proceedings of the 2010 IEEE 51st Annual Symposium on Foundations of Computer Science
A sober look at clustering stability
COLT'06 Proceedings of the 19th annual conference on Learning Theory
Clustering under perturbation resilience
ICALP'12 Proceedings of the 39th international colloquium conference on Automata, Languages, and Programming - Volume Part I
Data stability in clustering: a closer look
ALT'12 Proceedings of the 23rd international conference on Algorithmic Learning Theory
Clustering under approximation stability
Journal of the ACM (JACM)
Clustering under most popular objective functions is NP-hard, even to approximate well, and so is unlikely to be efficiently solvable in the worst case. Recently, Bilu and Linial (2010) [11] suggested an approach aimed at bypassing this computational barrier by exploiting properties of instances that one might hope hold in practice. In particular, they argue that instances arising in practice should be stable to small perturbations of the metric space, and they give an efficient algorithm for clustering instances of the Max-Cut problem that are stable to perturbations of size O(n^{1/2}). In addition, they conjecture that instances stable to perturbations as small as O(1) should be solvable in polynomial time. In this paper we prove that this conjecture is true for any center-based clustering objective (such as k-median, k-means, and k-center). Specifically, we show that the optimal clustering can be found efficiently assuming only stability to factor-3 perturbations of the underlying metric in spaces without Steiner points, and stability to factor-(2+√3) perturbations for general metrics. In particular, we show that for such instances the popular Single-Linkage algorithm combined with dynamic programming finds the optimal clustering. We also present NP-hardness results under a weaker but related condition.
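
To make the "Single-Linkage combined with dynamic programming" idea concrete, the following is a minimal Python sketch and not the paper's exact procedure. It assumes the k-median objective with centers restricted to the data points (the setting without Steiner points) and Euclidean distances. The function names (single_linkage_tree, k_median_cost, best_pruning) are hypothetical and chosen only for illustration. The sketch builds the single-linkage merge tree and then uses dynamic programming to pick the cheapest way of cutting that tree into k clusters. It does not verify that the input is perturbation-stable, so on unstable inputs the result is only the best pruning of the tree and need not be the global optimum.

```python
import itertools
import math
from functools import lru_cache


def single_linkage_tree(points):
    """Build the single-linkage merge tree by merging closest components first.

    A leaf is a point index (int); an internal node is a pair (left, right).
    """
    n = len(points)
    parent = list(range(n))
    tree = {i: i for i in range(n)}  # union-find root -> subtree built so far

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = sorted(itertools.combinations(range(n), 2),
                   key=lambda p: math.dist(points[p[0]], points[p[1]]))
    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:  # this edge joins two different components
            parent[rb] = ra
            tree[ra] = (tree[ra], tree.pop(rb))
    return tree[find(0)]


def leaves(node):
    """Return the point indices under a tree node, left to right."""
    if isinstance(node, int):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def k_median_cost(points, cluster):
    """Cost of one cluster when the center must be one of its own points (assumption)."""
    return min(sum(math.dist(points[c], points[x]) for x in cluster)
               for c in cluster)


def best_pruning(points, root, k):
    """Return (cost, clusters) for the cheapest cut of the tree into k clusters."""

    @lru_cache(maxsize=None)
    def solve(node, j):
        members = leaves(node)
        if j == 1:
            return k_median_cost(points, members), (tuple(members),)
        if j > len(members):  # cannot split into more clusters than points
            return math.inf, ()
        best = (math.inf, ())
        for j_left in range(1, j):  # distribute j clusters over the two children
            cost_l, part_l = solve(node[0], j_left)
            cost_r, part_r = solve(node[1], j - j_left)
            if cost_l + cost_r < best[0]:
                best = (cost_l + cost_r, part_l + part_r)
        return best

    return solve(root, k)
```

A call such as best_pruning(points, single_linkage_tree(points), k) returns the total cost together with the clusters as tuples of point indices. The dynamic program only considers clusterings that are prunings of the single-linkage tree; the stability assumption in the abstract is what would justify restricting the search this way.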