Fitting algebraic curves to noisy data

  • Authors:
  • Sanjeev Arora;Subhash Khot

  • Affiliations:
  • Princeton University, Princeton, NJ;Princeton University, Princeton, NJ

  • Venue:
  • STOC '02 Proceedings of the thiry-fourth annual ACM symposium on Theory of computing
  • Year:
  • 2002

Abstract

Motivated by applications in vision and pattern detection, we introduce the following problem. We are given pairs of datapoints $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$, a noise parameter $\delta > 0$, a degree bound $d$, and a threshold $\rho > 0$. We desire "every" degree $d$ polynomial $h$ satisfying $h(x_i) \in [y_i - \delta, y_i + \delta]$ for at least a $\rho$ fraction of the $i$'s. We assume by rescaling the data that each $x_i, y_i \in [-1, 1]$. If $\delta = 0$, this is just the list decoding problem that has been popular in complexity theory and for which Sudan gave a $\mathrm{poly}(d, 1/\rho)$ time algorithm. We show a few basic results about the problem. We show that there is no polynomial time algorithm for this problem as defined; the number of solutions can be as large as $\exp(d^{0.5 - \epsilon})$ even if the data is generated using a 50-50 mixture of two polynomials. We give a rigorous analysis of a brute force algorithm for the version of this problem where the data is generated from a mixture of polynomials. Finally, in surprising contrast to our "lower bound", we describe a polynomial-time algorithm for reconstructing mixtures of $O(1)$ polynomials when the mixing weights are "nondegenerate." The tools used include classical theory of approximations.
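
To make the problem statement concrete, the following is a minimal Python sketch of the agreement condition only: it checks whether a candidate polynomial $h$ lands within $\delta$ of $y_i$ on at least a $\rho$ fraction of the datapoints. It is not any of the paper's algorithms. The function names (`evaluate`, `agreement_fraction`, `is_solution`) and the toy data, a noisy 50-50 mixture of $h_1(x) = x$ and $h_2(x) = -x$, are assumptions made up for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def evaluate(coeffs: Sequence[float], x: float) -> float:
    """Evaluate a polynomial at x using Horner's rule.

    coeffs[k] is the coefficient of x**k (increasing degree order).
    """
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def agreement_fraction(coeffs: Sequence[float], points: List[Point], delta: float) -> float:
    """Fraction of points (x_i, y_i) with h(x_i) in [y_i - delta, y_i + delta]."""
    hits = sum(1 for x, y in points if abs(evaluate(coeffs, x) - y) <= delta)
    return hits / len(points)


def is_solution(coeffs: Sequence[float], points: List[Point], delta: float, rho: float) -> bool:
    """True if the polynomial agrees (up to delta) with at least a rho fraction of points."""
    return agreement_fraction(coeffs, points, delta) >= rho


# Hypothetical toy data: alternating samples of h1(x) = x and h2(x) = -x,
# each perturbed by 0.05, with all coordinates in [-1, 1].
points = [(-1.0, -0.95), (-0.5, 0.55), (0.5, 0.45), (1.0, -0.95)]
h1 = [0.0, 1.0]   # h1(x) = x
h2 = [0.0, -1.0]  # h2(x) = -x
zero = [0.0]      # h(x) = 0, which fits neither component

delta, rho = 0.1, 0.5
print(is_solution(h1, points, delta, rho))
print(is_solution(h2, points, delta, rho))
print(is_solution(zero, points, delta, rho))
```

In this toy instance both mixture components satisfy the condition with $\rho = 0.5$, while the zero polynomial does not. The problem as posed asks for every such polynomial, which is why the number of solutions matters for running time.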