Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

Authors:
N. Halko;P. G. Martinsson;J. A. Tropp
Affiliations:
nathan.halko@colorado.edu and martinss@colorado.edu;-;jtropp@acm.caltech.edu
Venue:
SIAM Review
Year:
2011

Abstract

Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed, either explicitly or implicitly, to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis.

The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the $k$ dominant components of the singular value decomposition of an $m \times n$ matrix. (i) For a dense input matrix, randomized algorithms require $O(mn \log(k))$ floating-point operations (flops) in contrast to $O(mnk)$ for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to $O(k)$ passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
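
The framework described in the abstract (sample randomly to find a subspace, compress the matrix to that subspace, then factor the small matrix deterministically) can be illustrated with a minimal NumPy sketch. This is not the authors' reference implementation. The function name `randomized_svd`, the Gaussian test matrix, the oversampling value of 10, and the restriction to real matrices are all assumptions made for illustration. This basic variant costs on the order of $mnk$ flops for a dense matrix; the $O(mn \log(k))$ count quoted above relies on structured random matrices that are not shown here.

```python
import numpy as np


def randomized_svd(A, k, oversample=10, seed=0):
    """Approximate the k dominant singular triplets of a real matrix A.

    Illustrative sketch only: the Gaussian test matrix and the default
    oversampling of 10 are assumed choices, not taken from the abstract.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    ell = min(k + oversample, n)  # number of random samples to draw

    # Step 1: random sampling. Multiply A by a random test matrix and
    # orthonormalize the result; the columns of Q span a subspace that
    # captures most of the action of A.
    Omega = rng.standard_normal((n, ell))
    Q, _ = np.linalg.qr(A @ Omega)

    # Step 2: compress A to that subspace (B is only ell x n) and factor
    # the small matrix with a standard deterministic SVD.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)

    # Map the left singular vectors back to the original space and truncate.
    U = Q @ U_small
    return U[:, :k], s[:k], Vt[:k, :]


# Usage: a 200 x 100 test matrix of exact rank 5.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 100))
U, s, Vt = randomized_svd(A, k=5)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
```

Because the test matrix here has exact rank 5, the sampled subspace contains the whole range of `A`, so `rel_err` should be near machine precision. For a matrix whose singular values decay slowly, the error is larger and depends on the number of samples drawn.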