A near-linear algorithm for projective clustering integer points

  • Authors:
  • Kasturi Varadarajan;Xin Xiao

  • Affiliations:
  • University of Iowa, Iowa City, IA;University of Iowa, Iowa City, IA

  • Venue:
  • Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

We consider the problem of projective clustering in Euclidean spaces of non-fixed dimension. Here, we are given a set P of n points in Rm and integers j ≥ 1, k ≥ 0, and the goal is to find j k-subspaces so that the sum of the distances of each point in P to the nearest subspace is minimized. Observe that this is a shape fitting problem where we wish to find the best fit in the L1 sense. Here we will treat the number j of subspaces we want to fit and the dimension k of each of them as constants. We consider instances of projective clustering where the point coordinates are integers of magnitude polynomial in m and n. Our main result is a randomized algorithm that for any ε 0 runs in time O(mn polylog(mn)) and outputs a solution that with high probability is within (1 + ε) of the optimal solution. To obtain this result, we show that the fixed dimensional version of the above projective clustering problem has a small coreset. We do that by observing that in a fairly general sense, shape fitting problems that have small coresets in the L∞ setting also have small coresets in the L1 setting, and then exploiting an existing construction for the L∞ setting. This observation seems to be quite useful for other shape fitting problems as well, as we demonstrate by constructing the first "regular" coreset for the circle fitting problem in the plane.