Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix

  • Authors:
  • Petros Drineas, Ravi Kannan, Michael W. Mahoney

  • Venue:
  • SIAM Journal on Computing
  • Year:
  • 2006

Abstract

In many applications, the data consist of (or may be naturally formulated as) an $m \times n$ matrix $A$. It is often of interest to find a low-rank approximation to $A$, i.e., an approximation $D$ to the matrix $A$ of rank not greater than a specified rank $k$, where $k$ is much smaller than $m$ and $n$. Methods such as the singular value decomposition (SVD) may be used to find an approximation to $A$ which is the best in a well-defined sense. These methods require memory and time which are superlinear in $m$ and $n$; for many applications in which the data sets are very large this is prohibitive. Two simple and intuitive algorithms are presented which, when given an $m \times n$ matrix $A$, compute a description of a low-rank approximation $D^{*}$ to $A$, and which are qualitatively faster than the SVD. Both algorithms have provable bounds for the error matrix $A-D^{*}$. For any matrix $X$, let $\|X\|_F$ and $\|X\|_2$ denote its Frobenius norm and its spectral norm, respectively. In the first algorithm, $c$ columns of $A$ are randomly chosen. If the $m \times c$ matrix $C$ consists of those $c$ columns of $A$ (after appropriate rescaling), then it is shown that approximations to the top singular values and corresponding singular vectors may be computed from $C^TC$. From the computed singular vectors a description $D^{*}$ of the matrix $A$ may be computed such that $\mathrm{rank}(D^{*}) \le k$ and such that $$ \left\|A-D^{*}\right\|_{\xi}^{2} \le \min_{D:\,\mathrm{rank}(D)\le k} \left\|A-D\right\|_{\xi}^{2} + \mathrm{poly}(k,1/c) \left\|A\right\|_F^2 $$ holds with high probability for both $\xi = 2, F$. This algorithm may be implemented without storing the matrix $A$ in random access memory (RAM), provided it can make two passes over the matrix stored in external memory and use $O(cm+c^2)$ additional RAM. The second algorithm is similar, except that it further approximates the matrix $C$ by randomly sampling $r$ rows of $C$ to form an $r \times c$ matrix $W$. Thus, it has additional error, but it can be implemented in three passes over the matrix using only constant additional RAM. To achieve an additional error (beyond the best rank-$k$ approximation) that is at most $\epsilon\|A\|_F^2$, both algorithms take time which is polynomial in $k$, $1/\epsilon$, and $\log(1/\delta)$, where $\delta > 0$ is a failure probability; the first takes time linear in $\max(m,n)$ and the second takes time independent of $m$ and $n$. Our bounds improve previously published results with respect to the rank parameter $k$ for both the Frobenius and spectral norms. In addition, the proofs for the error bounds use a novel method that makes important use of matrix perturbation theory. The probability distribution over columns of $A$ and the rescaling are crucial features of the algorithms, which must be chosen judiciously.
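To make the first algorithm concrete, the following is a minimal NumPy sketch of the column-sampling step described above. It assumes the sampling probabilities are proportional to squared column norms, $p_j = \|A^{(j)}\|^2 / \|A\|_F^2$, with rescaling by $1/\sqrt{c\,p_j}$ (the distribution and rescaling the abstract calls crucial); the function name and interface are illustrative, not from the paper.

```python
import numpy as np

def linear_time_svd(A, c, k, rng=None):
    """Hypothetical sketch of the two-pass column-sampling algorithm.

    Assumption: columns are drawn with probability p_j proportional to
    |A^(j)|^2 and rescaled by 1/sqrt(c * p_j), so that E[C C^T] = A A^T.
    Assumes k <= rank(C).
    """
    rng = np.random.default_rng() if rng is None else rng
    m, n = A.shape
    col_norms_sq = (A * A).sum(axis=0)
    p = col_norms_sq / col_norms_sq.sum()
    idx = rng.choice(n, size=c, p=p)         # sample c column indices i.i.d.
    C = A[:, idx] / np.sqrt(c * p[idx])      # m x c sampled, rescaled matrix
    # Singular values/vectors of C via the small c x c matrix C^T C.
    evals, V = np.linalg.eigh(C.T @ C)       # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:k]
    sigma = np.sqrt(np.clip(evals[top], 0.0, None))
    H = (C @ V[:, top]) / sigma              # approximate top-k left singular vectors of A
    return H, sigma

# The rank-k description D* is the projection of A onto span(H):
# D_star = H @ (H.T @ A)    # rank(D*) <= k
```

In the external-memory setting this matches the abstract's two-pass claim: one pass over $A$ computes the column norms (hence $p$), and a second pass extracts the sampled columns; only the $m \times c$ matrix $C$ and the $c \times c$ matrix $C^TC$ need to be held in RAM, and $H$ alone serves as the description of $D^{*}$ without forming it explicitly.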