Streaming Algorithms for k-Center Clustering with Outliers and with Anonymity

  • Authors:
  • Richard Matthew McCutchen;Samir Khuller

  • Affiliations:
  • University of Maryland;University of Maryland

  • Venue:
  • APPROX '08 / RANDOM '08 Proceedings of the 11th international workshop, APPROX 2008, and 12th international workshop, RANDOM 2008 on Approximation, Randomization and Combinatorial Optimization: Algorithms and Techniques
  • Year:
  • 2008

Abstract

Clustering is a common problem in the analysis of large data sets. Streaming algorithms, which make a single pass over the data set using small working memory and produce a clustering comparable in cost to the optimal offline solution, are especially useful. We develop the first streaming algorithms achieving a constant-factor approximation to the cluster radius for two variations of the k-center clustering problem. We give a streaming (4 + ε)-approximation algorithm using O(ε⁻¹kz) memory for the problem with outliers, in which the clustering is allowed to drop up to z of the input points; previous work used a random sampling approach which yields only a bicriteria approximation. We also give a streaming (6 + ε)-approximation algorithm using O(ε⁻¹ ln(ε⁻¹) k + k²) memory for a variation motivated by anonymity considerations in which each cluster must contain at least a certain number of input points.
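
To make the outlier objective concrete, the following minimal sketch evaluates the cluster radius of a given set of centers when up to z points may be dropped. It is not the paper's streaming algorithm; it only illustrates the cost being approximated. The function name radius_with_outliers, the Euclidean metric, and the sample points are assumptions chosen for illustration.

```python
import math

def radius_with_outliers(points, centers, z):
    """Smallest radius covering all but at most z points (hypothetical helper).

    Assumes Euclidean distance; the abstract does not fix a metric.
    """
    # Distance from each point to its nearest center, in ascending order.
    dists = sorted(min(math.dist(p, c) for c in centers) for p in points)
    # Dropping the z farthest points leaves the (n - z)-th smallest distance.
    kept = len(dists) - z
    return dists[kept - 1] if kept > 0 else 0.0

# Illustrative data: two tight groups plus one far-away point.
points = [(0, 0), (1, 0), (10, 0), (11, 0), (100, 100)]
centers = [(0, 0), (10, 0)]

print(radius_with_outliers(points, centers, z=0))  # dominated by the far point
print(radius_with_outliers(points, centers, z=1))  # far point dropped as an outlier
```

With z = 0 the single distant point determines the radius; allowing one outlier reduces the radius to the spread within the two groups, which is why the variant with outliers is useful on noisy data.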