Approximation algorithms for clustering points in metric spaces form a flourishing area of research, with much effort spent on better understanding the approximation guarantees achievable for objective functions such as k-median, k-means, and min-sum clustering. This quest for better approximation algorithms is further fueled by the implicit hope that better approximations also yield more accurate clusterings. For example, in many problems---such as clustering proteins by function or clustering images by subject---there is some unknown correct "target" clustering, and the implicit hope is that approximately optimizing these objective functions will in fact produce a clustering that is close pointwise to the truth. In this paper, we show that if we make this implicit assumption explicit---that is, if we assume that any c-approximation to the given clustering objective φ is ε-close to the target---then we can produce clusterings that are O(ε)-close to the target, even for values of c for which obtaining a c-approximation is NP-hard. In particular, for the k-median and k-means objectives, we show that we can achieve this guarantee for any constant c > 1, and for the min-sum objective for any constant c > 2. Our results also highlight a surprising conceptual difference between assuming that the optimal solution to, say, the k-median objective is ε-close to the target and assuming that any approximately optimal solution is ε-close to the target, even for an approximation factor of, say, c = 1.01. In the former case, the problem of finding a solution that is O(ε)-close to the target remains computationally hard, yet in the latter case we have an efficient algorithm.
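To make the two quantities in the abstract concrete, here is a minimal illustrative sketch (not from the paper): a k-median cost function for 1-D points, and a pointwise distance between two clusterings — the fraction of misclassified points under the best matching of cluster labels, which is the sense in which a clustering is "ε-close" to the target. The helper names and the brute-force label matching are assumptions for illustration only; real instances would use a general metric and an efficient matching.

```python
from itertools import permutations

def kmedian_cost(points, centers):
    """k-median cost: each point pays the distance to its nearest center.
    Illustrative 1-D version; in general any metric d(p, c) would be used."""
    return sum(min(abs(p - c) for c in centers) for p in points)

def distance_to_target(clustering, target):
    """Pointwise distance between two clusterings (given as label lists):
    the fraction of points whose label disagrees with the target, minimized
    over all relabelings of the clusters. Brute-force over permutations,
    so only suitable for a small number of clusters."""
    labels = sorted(set(clustering) | set(target))
    n = len(clustering)
    best = n
    for perm in permutations(labels):
        mapping = dict(zip(labels, perm))
        mismatches = sum(1 for a, b in zip(clustering, target)
                         if mapping[a] != b)
        best = min(best, mismatches)
    return best / n

# Two well-separated groups on the line:
points = [0.0, 1.0, 10.0, 11.0]
print(kmedian_cost(points, [0.0, 10.0]))            # cost 2.0
# Same partition under swapped labels is 0-close to the target:
print(distance_to_target([0, 0, 1, 1], [1, 1, 0, 0]))  # 0.0
# One of four points misplaced: 0.25-close.
print(distance_to_target([0, 0, 0, 1], [0, 0, 1, 1]))  # 0.25
```

The approximation-stability assumption then says: every clustering whose `kmedian_cost` is within a factor c of optimal has `distance_to_target` at most ε.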