We give a sampling-based algorithm for the $k$-Median problem, with running time $O\left(k\left(\frac{k^2}{\epsilon}\log k\right)^2 \log\left(\frac{k}{\epsilon}\log k\right)\right)$, where $k$ is the desired number of clusters and $\epsilon$ is a confidence parameter. This is the first $k$-Median algorithm with fully polynomial running time that is independent of $n$, the size of the data set. It gives a solution that is, with high probability, an $O(1)$-approximation, if each cluster in some optimal solution has $\Omega\left(\frac{n\epsilon}{k}\right)$ points. We also give weakly-polynomial-time algorithms for this problem and a relaxed version of $k$-Median in which a small fraction of outliers can be excluded. We give near-matching lower bounds showing that this assumption about cluster size is necessary. We also present a related algorithm for finding a clustering that excludes a small number of outliers.
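The sample-then-cluster idea can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the abstract does not specify the sampling constants or the subroutine run on the sample, so the sample size $s = c\,\frac{k^2}{\epsilon}\log k$ (read off the term inside the running-time bound, with a hypothetical constant `c`) and the single-swap local search used here are assumptions, as are all function names. The sketch shows only why the work done after sampling depends on $k$ and $\epsilon$ but not on $n$.

```python
import math
import random


def sample_size(k, eps, c=1.0):
    """Assumed sample size s = c * (k^2 / eps) * log k (c is a hypothetical constant)."""
    return max(k, math.ceil(c * (k * k / eps) * math.log(max(k, 2))))


def kmedian_cost(points, centers, dist):
    """Sum over all points of the distance to the nearest center."""
    return sum(min(dist(p, c) for c in centers) for p in points)


def local_search_kmedian(points, k, dist, rng):
    """Single-swap local search; a stand-in solver, not the paper's subroutine."""
    centers = rng.sample(points, k)
    best = kmedian_cost(points, centers, dist)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for p in points:
                if p in centers:
                    continue
                trial = centers[:i] + [p] + centers[i + 1:]
                cost = kmedian_cost(points, trial, dist)
                if cost < best:  # accept only strict improvements, so the loop terminates
                    centers, best = trial, cost
                    improved = True
    return centers


def sampled_kmedian(data, k, eps, dist, seed=0):
    """Cluster a uniform sample whose size does not depend on len(data)."""
    rng = random.Random(seed)
    s = min(len(data), sample_size(k, eps))
    sample = rng.sample(data, s)
    return local_search_kmedian(sample, k, dist, rng)


if __name__ == "__main__":
    # Two well-separated 1-D clusters of 150 points each (n = 300).
    data = [0.0, 1.0, 2.0] * 50 + [100.0, 101.0, 102.0] * 50
    print(sampled_kmedian(data, k=2, eps=0.5, dist=lambda a, b: abs(a - b)))
```

In this toy input each cluster holds 150 points, which exceeds $\frac{n\epsilon}{k} = 75$, so it is consistent with the cluster-size assumption stated in the abstract. A uniform sample this small can still miss a cluster, which is why the guarantee is stated only with high probability.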