Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression

  • Authors: Xiangrui Meng; Michael W. Mahoney
  • Affiliations: LinkedIn Corporation, Mountain View, CA, USA; Stanford University, Stanford, CA, USA
  • Venue: Proceedings of the forty-fifth annual ACM symposium on Theory of Computing
  • Year: 2013


Abstract

Low-distortion embeddings are critical building blocks for developing random sampling and random projection algorithms for common linear algebra problems. We show that, given a matrix A ∈ R^{n×d} with n ≫ d and a p ∈ [1, 2), with constant probability we can construct a low-distortion embedding matrix Π ∈ R^{O(poly(d))×n} that embeds A_p, the ℓ_p subspace spanned by A's columns, into (R^{O(poly(d))}, ‖·‖_p); the distortion of our embeddings is only O(poly(d)), and we can compute ΠA in O(nnz(A)) time, i.e., input-sparsity time. Our result generalizes the input-sparsity time ℓ_2 subspace embedding of Clarkson and Woodruff [STOC'13]; for completeness, we present a simpler and improved analysis of their construction for ℓ_2. These input-sparsity time ℓ_p embeddings are optimal, up to constants, in terms of their running time, and the improved running time propagates to applications such as (1 ± ε)-distortion ℓ_p subspace embedding and relative-error ℓ_p regression. For ℓ_2, we show that a (1 + ε)-approximate solution to the ℓ_2 regression problem specified by the matrix A and a vector b ∈ R^n can be computed in O(nnz(A) + d³ log(d/ε)/ε²) time. For ℓ_p, via a subspace-preserving sampling procedure, we show that a (1 ± ε)-distortion embedding of A_p into R^{O(poly(d))} can be computed in O(nnz(A) · log n) time, and that a (1 + ε)-approximate solution to the ℓ_p regression problem min_{x ∈ R^d} ‖Ax − b‖_p can be computed in O(nnz(A) · log n + poly(d) log(1/ε)/ε²) time. Moreover, we can improve the embedding dimension (equivalently, the sample size) to O(d^{3+p/2} log(1/ε)/ε²) without increasing the complexity.
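To make the ℓ_2 case concrete, the following is a minimal sketch (not the authors' exact construction) of the CountSketch-style input-sparsity time embedding in the spirit of Clarkson and Woodruff: each row of A is hashed to one of m sketch rows and multiplied by a random sign, so ΠA is accumulated in a single pass over the nonzeros of A, and an approximate ℓ_2 regression solution is then obtained from the much smaller sketched problem. The function names, the choice m = 10d², and the use of a dense numpy array are illustrative assumptions.

```python
import numpy as np

def countsketch_embed(A, m, rng):
    """Apply a CountSketch-style embedding Pi to A in one pass.

    Row i of A is added, with a random sign s[i], into sketch row h[i];
    this computes Pi @ A in time proportional to nnz(A)."""
    n = A.shape[0]
    h = rng.integers(0, m, size=n)       # hash bucket for each row
    s = rng.choice([-1.0, 1.0], size=n)  # random sign for each row
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, h, s[:, None] * A)     # unbuffered accumulation
    return SA

def sketched_l2_regression(A, b, m=None, rng=None):
    """Approximate argmin_x ||Ax - b||_2 by solving the sketched problem."""
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = A.shape
    m = 10 * d * d if m is None else m   # O(poly(d)) sketch rows (assumed)
    # Sketch [A | b] jointly so A and b see the same embedding.
    S = countsketch_embed(np.hstack([A, b[:, None]]), m, rng)
    SA, Sb = S[:, :d], S[:, d]
    x, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
    return x
```

On a consistent system (b exactly in the column span of A), the sketched solution coincides with the true minimizer whenever SA has full column rank, which holds with high probability for m ≫ d; for general b, the theory above guarantees a (1 + ε)-approximate residual for a suitable m.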