Matousek [Discrete Comput. Geom. 24 (1) (2000) 61-84] designed an O(n log n) deterministic algorithm for the approximate 2-means clustering problem for points in fixed-dimensional Euclidean space, which left open the possibility of a linear-time algorithm. In this paper, we present a simple randomized algorithm that, with constant probability, determines an approximate 2-means clustering of a given set of points in fixed-dimensional Euclidean space in linear time. We first approximate the mean of the larger cluster using random sampling. We then show that the problem can be reduced to a set of lines, on which it can be solved by carefully pruning away points of the larger cluster and randomly sampling from the remaining points to obtain an approximation to the mean of the smaller cluster.
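The 2-means objective and the sampling idea behind the first step can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names (`mean`, `two_means_cost`, `sampled_mean`) are hypothetical, and `sampled_mean` assumes the cluster's points are already known, whereas the algorithm described above must work without knowing cluster membership. The sketch only shows the cost being approximated and how the mean of a small uniform sample can stand in for the mean of a cluster.

```python
import random


def mean(points):
    """Coordinate-wise mean (centroid) of a non-empty list of points."""
    n = len(points)
    d = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(d))


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def two_means_cost(points, c1, c2):
    """2-means cost: each point pays its squared distance to the nearer center."""
    return sum(min(sq_dist(p, c1), sq_dist(p, c2)) for p in points)


def sampled_mean(cluster, sample_size, rng):
    """Estimate a cluster's mean from a uniform sample drawn with replacement.

    Assumption for illustration only: the cluster is given explicitly.
    """
    sample = [rng.choice(cluster) for _ in range(sample_size)]
    return mean(sample)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data: a larger cluster near the origin, a smaller one far away.
    large = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(900)]
    small = [(rng.gauss(20, 1), rng.gauss(20, 1)) for _ in range(100)]
    points = large + small

    exact = two_means_cost(points, mean(large), mean(small))
    approx = two_means_cost(points, sampled_mean(large, 50, rng), mean(small))
    print("cost with exact means:  ", exact)
    print("cost with sampled mean: ", approx)
```

In this toy setting the cost obtained with the sampled center is typically close to the cost obtained with the exact centroid, which is the intuition behind approximating the larger cluster's mean by random sampling.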