Low rank approximation and regression in input sparsity time

  • Authors:
  • Kenneth L. Clarkson; David P. Woodruff

  • Affiliations:
  • IBM, San Jose, CA, USA; IBM, San Jose, CA, USA

  • Venue:
  • Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing
  • Year:
  • 2013

Abstract

We design a new distribution over poly(r ε⁻¹) × n matrices S so that for any fixed n × d matrix A of rank r, with probability at least 9/10, ‖SAx‖₂ = (1 ± ε)‖Ax‖₂ simultaneously for all x ∈ Rᵈ. Such a matrix S is called a subspace embedding. Furthermore, SA can be computed in O(nnz(A)) + Õ(r²ε⁻²) time, where nnz(A) is the number of non-zero entries of A. This improves over all previous subspace embeddings, which required at least Ω(nd log d) time to achieve this property. We call our matrices S sparse embedding matrices (minimal code sketches of the construction and of sketch-and-solve regression follow the abstract).

Using our sparse embedding matrices, we obtain the fastest known algorithms for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and ℓₚ-regression:

  • To output an x' for which ‖Ax' − b‖₂ ≤ (1 + ε) minₓ ‖Ax − b‖₂ for an n × d matrix A and an n × 1 column vector b, we obtain an algorithm running in O(nnz(A)) + Õ(d³ε⁻²) time, and another in O(nnz(A) log(1/ε)) + Õ(d³ log(1/ε)) time. (Here Õ(f) = f · log^O(1)(f).)

  • To obtain a decomposition of an n × n matrix A into a product of an n × k matrix L, a k × k diagonal matrix D, and an n × k matrix W, for which ‖A − LDWᵀ‖_F ≤ (1 + ε)‖A − A_k‖_F, where A_k is the best rank-k approximation, our algorithm runs in O(nnz(A)) + Õ(nk²ε⁻⁴ log n + k³ε⁻⁵ log² n) time.

  • To output an approximation to all leverage scores of an n × d input matrix A simultaneously, with constant relative error, our algorithms run in O(nnz(A) log n) + Õ(r³) time.

  • To output an x' for which ‖Ax' − b‖ₚ ≤ (1 + ε) minₓ ‖Ax − b‖ₚ for an n × d matrix A and an n × 1 column vector b, we obtain an algorithm running in O(nnz(A) log n) + poly(r ε⁻¹) time, for any constant 1 ≤ p < ∞.
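The construction behind S is a CountSketch-style matrix: each column of S contains a single ±1 entry in a uniformly random row, so SA can be formed in one pass over the nonzeros of A. Below is a minimal NumPy/SciPy sketch of this idea; the helper name sparse_embedding and the concrete sizes (n, d, m) are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy import sparse

def sparse_embedding(m, n, rng):
    """Illustrative m x n sparse embedding: each column gets a single
    +/-1 entry in a uniformly random row (CountSketch-style)."""
    rows = rng.integers(0, m, size=n)        # h(j): target row for column j
    signs = rng.choice([-1.0, 1.0], size=n)  # sigma(j): random sign for column j
    return sparse.csr_matrix((signs, (rows, np.arange(n))), shape=(m, n))

rng = np.random.default_rng(0)
n, d = 100_000, 20
A = sparse.random(n, d, density=1e-3, format="csr", random_state=0)
m = 1_000                        # illustrative sketch size; the paper takes m = poly(r/eps)
S = sparse_embedding(m, n, rng)
SA = S @ A                       # touches each nonzero of A once: O(nnz(A)) work

# For a fixed x, ||SAx||_2 should concentrate around ||Ax||_2.
x = rng.standard_normal(d)
print(np.linalg.norm(SA @ x) / np.linalg.norm(A @ x))
```

Because each column of S has exactly one nonzero, applying S amounts to a signed bucketing of A's rows, which is what makes the O(nnz(A)) application time possible.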
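For the least-squares claim, the simplest way to use such an embedding is sketch-and-solve: replace the n × d problem minₓ ‖Ax − b‖₂ by the much smaller m × d problem minₓ ‖SAx − Sb‖₂. The paper's stated running times come from more refined algorithms, so the snippet below is only a hedged illustration of the basic idea, again with arbitrary illustrative sizes.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(1)
n, d, m = 50_000, 10, 800        # illustrative sizes; m is not the paper's bound

# CountSketch-style embedding: one random +/-1 per column of S.
rows = rng.integers(0, m, size=n)
signs = rng.choice([-1.0, 1.0], size=n)
S = sparse.csr_matrix((signs, (rows, np.arange(n))), shape=(m, n))

A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Solve the small sketched problem instead of the original one.
x_sk, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_opt, *_ = np.linalg.lstsq(A, b, rcond=None)

# The sketched residual should be within a (1 + eps) factor of optimal,
# with good probability, for m large enough relative to d.
print(np.linalg.norm(A @ x_sk - b) / np.linalg.norm(A @ x_opt - b))
```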