Principal component analysis (PCA) requires the computation of a low-rank approximation to a matrix containing the data being analyzed. In many applications of PCA, the best possible accuracy of any rank-deficient approximation is at most a few digits (measured in the spectral norm, relative to the spectral norm of the matrix being approximated). In such circumstances, efficient algorithms have not come with guarantees of good accuracy, unless one or both dimensions of the matrix being approximated are small. We describe an efficient algorithm for the low-rank approximation of matrices that produces accuracy that is very close to the best possible accuracy, for matrices of arbitrary sizes. We illustrate our theoretical results via several numerical examples.
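The abstract does not spell out the algorithm itself. As a rough illustration of the general approach behind modern efficient low-rank approximation, here is a minimal sketch of a randomized range-finder with power iterations, in the style of randomized SVD methods; the function name, the oversampling parameter, and the iteration count are illustrative choices, not the paper's own specification:

```python
import numpy as np

def randomized_low_rank(A, k, n_iter=4, oversample=10, seed=0):
    """Sketch of a rank-k approximation A ~ U @ diag(s) @ Vt.

    Samples the range of A with a Gaussian test matrix, refines the
    sampled subspace with a few power iterations (which improves
    spectral-norm accuracy when singular values decay slowly), then
    solves a small dense SVD in that subspace.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    l = k + oversample  # slight oversampling stabilizes the sketch
    # Capture an approximate basis Q for the range of A.
    Q = np.linalg.qr(A @ rng.standard_normal((n, l)))[0]
    for _ in range(n_iter):
        # One power iteration: multiply by A.T then A, re-orthonormalizing.
        Q = np.linalg.qr(A.T @ Q)[0]
        Q = np.linalg.qr(A @ Q)[0]
    # SVD of the small (l x n) projected matrix, then lift back.
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    U = Q @ U_small
    return U[:, :k], s[:k], Vt[:k, :]
```

On a matrix whose numerical rank is at most k, a sketch like this recovers the matrix to near machine precision; the harder regime the abstract addresses is when the best rank-k error is itself only a few digits, where naive one-pass sketches can lose accuracy.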