Numerical linear algebra in the streaming model

  • Authors:
  • Kenneth L. Clarkson;David P. Woodruff

  • Affiliations:
  • IBM Almaden, San Jose, CA, USA;IBM Almaden, San Jose, CA, USA

  • Venue:
  • Proceedings of the forty-first annual ACM symposium on Theory of computing
  • Year:
  • 2009

Abstract

We give near-optimal space bounds in the streaming model for linear algebra problems that include estimation of matrix products, linear regression, low-rank approximation, and approximation of matrix rank. In the streaming model, sketches of input matrices are maintained under updates of matrix entries; we prove results for turnstile updates, given in an arbitrary order. We give the first lower bounds known for the space needed by the sketches, for a given estimation error $\epsilon$. We sharpen prior upper bounds, with respect to combinations of space, failure probability, and number of passes. The sketch we use for matrix $A$ is simply $S^T A$, where $S$ is a sign matrix.

Our results include the following upper and lower bounds on the bits of space needed for 1-pass algorithms. Here $A$ is an $n \times d$ matrix, $B$ is an $n \times d'$ matrix, and $c := d + d'$. These results are given for fixed failure probability; for failure probability $\delta > 0$, the upper bounds require a factor of $\log(1/\delta)$ more space. We assume the inputs have integer entries specified by $O(\log(nc))$ bits, or $O(\log(nd))$ bits.

(Matrix Product) Output matrix $C$ with $\|A^T B - C\|_F \le \epsilon \|A\|_F \|B\|_F$. We show that $\Theta(c \epsilon^{-2} \log(nc))$ space is needed.

(Linear Regression) For $d' = 1$, so that $B$ is a vector $b$, find $x$ so that $\|Ax - b\| \le (1+\epsilon) \min_{x' \in \mathbb{R}^d} \|Ax' - b\|$. We show that $\Theta(d^2 \epsilon^{-1} \log(nd))$ space is needed.

(Rank-k Approximation) Find matrix $\tilde{A}_k$ of rank no more than $k$, so that $\|A - \tilde{A}_k\|_F \le (1+\epsilon) \|A - A_k\|_F$, where $A_k$ is the best rank-$k$ approximation to $A$. Our lower bound is $\Omega(k \epsilon^{-1} (n+d) \log(nd))$ space, and we give a one-pass algorithm matching this when $A$ is given row-wise or column-wise. For general updates, we give a one-pass algorithm needing $O(k \epsilon^{-2} (n + d/\epsilon^2) \log(nd))$ space.

We also give upper and lower bounds for algorithms using multiple passes, and a sketching analog of the CUR decomposition.
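
As an illustration of the sign-matrix sketch mentioned in the abstract, the following is a minimal Python sketch of how the product of the transposed sign matrix S with A could be maintained under turnstile updates, and how two such sketches might be combined to estimate the matrix product. Several details are assumptions made for illustration and are not taken from the paper: the class name `SignSketch`, the function `estimate_product`, the sketch width `m`, the 1/m scaling, and the use of fully independent signs stored explicitly (which is not space-efficient).

```python
import random


class SignSketch:
    """Maintains S^T A for an n x d matrix A under turnstile updates."""

    def __init__(self, n, d, m, seed=0):
        rng = random.Random(seed)
        # S is an n x m matrix of independent +1/-1 entries. Storing it
        # explicitly is an assumption made for clarity, not for space.
        self.S = [[rng.choice((-1, 1)) for _ in range(m)] for _ in range(n)]
        self.m = m
        self.d = d
        # The m x d sketch S^T A, kept unscaled so entries stay integers.
        self.sketch = [[0] * d for _ in range(m)]

    def update(self, i, j, delta):
        """Apply the turnstile update A[i][j] += delta (delta may be negative)."""
        row_signs = self.S[i]
        for r in range(self.m):
            self.sketch[r][j] += row_signs[r] * delta


def estimate_product(sk_a, sk_b):
    """Estimate A^T B as (S^T A)^T (S^T B) / m.

    Both sketches must have been built with the same n, m, and seed,
    so that they share the same sign matrix S.
    """
    m = sk_a.m
    return [
        [sum(sk_a.sketch[r][p] * sk_b.sketch[r][q] for r in range(m)) / m
         for q in range(sk_b.d)]
        for p in range(sk_a.d)
    ]
```

Because the sketch is linear in A, updates can arrive in any order and a negative update cancels an earlier positive one, which is what the turnstile model requires. Under the assumptions above the estimate is unbiased, since the expectation of S S^T / m is the identity; how large `m` must be for a given error and failure probability is the subject of the bounds stated in the abstract.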