Clustering is a common problem in the analysis of large data sets. Streaming algorithms, which make a single pass over the data set using small working memory and produce a clustering comparable in cost to the optimal offline solution, are especially useful. We develop the first streaming algorithms achieving a constant-factor approximation to the cluster radius for two variations of the k-center clustering problem. We give a streaming (4 + ε)-approximation algorithm using O(ε⁻¹kz) memory for the problem with outliers, in which the clustering is allowed to drop up to z of the input points; previous work used a random sampling approach which yields only a bicriteria approximation. We also give a streaming (6 + ε)-approximation algorithm using O(ε⁻¹ ln(ε⁻¹) k + k²) memory for a variation motivated by anonymity considerations in which each cluster must contain at least a certain number of input points.
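
To make the outlier objective concrete, the following minimal sketch evaluates the cluster radius of a given set of centers when up to z points may be dropped. It is not the streaming algorithm described above: it only illustrates the cost being approximated, it assumes Euclidean distance, and the function name `radius_with_outliers` and the sample data are hypothetical.

```python
import math


def radius_with_outliers(points, centers, z):
    """Smallest radius r such that all but at most z points
    lie within distance r of some center (Euclidean distance assumed)."""
    # Distance from each point to its nearest center, in ascending order.
    nearest = sorted(min(math.dist(p, c) for c in centers) for p in points)
    # Dropping the z farthest points leaves the largest remaining distance
    # as the radius; if every point is dropped, the radius is zero.
    kept = nearest[:max(len(nearest) - z, 0)]
    return kept[-1] if kept else 0.0


# Hypothetical data: two tight groups plus one far-away point.
points = [(0, 0), (1, 0), (10, 0), (11, 0), (100, 0)]
centers = [(0, 0), (10, 0)]

print(radius_with_outliers(points, centers, 0))  # 90.0: the far point dominates
print(radius_with_outliers(points, centers, 1))  # 1.0: the far point is dropped
```

With z = 0 the single distant point forces a radius of 90, while allowing one outlier reduces the radius to 1. This sensitivity to a few extreme points is what the variant with outliers is meant to address.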