Subspace sampling and relative-error matrix approximation: column-row-based methods

  • Authors:
  • Petros Drineas; Michael W. Mahoney; S. Muthukrishnan

  • Affiliations:
  • Department of Computer Science, RPI; Yahoo Research Labs; Department of Computer Science, Rutgers University

  • Venue:
  • ESA'06: Proceedings of the 14th Annual European Symposium on Algorithms
  • Year:
  • 2006

Abstract

Much recent work in theoretical computer science, linear algebra, and machine learning has considered matrix decompositions of the following form: given an m × n matrix A, decompose it as a product of three matrices, C, U, and R, where C consists of a small number of columns of A, R consists of a small number of rows of A, and U is a small, carefully constructed matrix that guarantees that the product CUR is "close" to A. Applications of such decompositions include the computation of matrix "sketches", speeding up kernel-based statistical learning, preserving sparsity in low-rank matrix representations, and improved interpretability of data analysis methods. Our main result is a randomized, polynomial-time algorithm which, given as input an m × n matrix A, returns as output matrices C, U, R such that ||A - CUR||_F ≤ (1+ε)||A - A_k||_F with probability at least 1-δ. Here, A_k is the "best" rank-k approximation to A (provided by truncating the Singular Value Decomposition of A), and ||X||_F is the Frobenius norm of the matrix X. The number of columns in C and rows in R is a low-degree polynomial in k, 1/ε, and log(1/δ). Our main result is obtained by an extension of our recent relative-error approximation algorithm for ℓ2 regression from overconstrained problems to general ℓ2 regression problems. Our algorithm is simple, and its running time is of the order of the time needed to compute the top k right singular vectors of A. In addition, it samples the columns and rows of A via the method of "subspace sampling," so named since the sampling probabilities depend on the lengths of the rows of the matrix of top singular vectors, and since they ensure that we capture entirely a certain subspace of interest.
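
For intuition, the following is a minimal NumPy sketch of a leverage-score ("subspace sampling") CUR decomposition in the spirit of the abstract, not the paper's exact algorithm: columns are sampled with probabilities proportional to the squared row norms of the top-k right singular vectors, rows analogously from the top-k left singular vectors, and the middle matrix is taken to be C⁺AR⁺ (a common simplification; the paper's construction of U and its scaling of the sampled columns and rows differ). The function name `cur_subspace_sampling` and all parameter choices below are illustrative assumptions.

```python
import numpy as np


def cur_subspace_sampling(A, k, c, r, seed=None):
    """Sketch of a CUR decomposition via leverage-score (subspace) sampling.

    Returns C (c columns of A), U, R (r rows of A) with A ~ C @ U @ R.
    This is an illustrative simplification, not the paper's exact algorithm.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape

    # Top-k singular vectors of A.
    U_k, _, Vt_k = np.linalg.svd(A, full_matrices=False)
    U_k, Vt_k = U_k[:, :k], Vt_k[:k, :]

    # Column leverage scores: squared column norms of Vt_k (they sum to k).
    col_lev = np.sum(Vt_k**2, axis=0)
    col_probs = col_lev / col_lev.sum()

    # Row leverage scores: squared row norms of U_k.
    row_lev = np.sum(U_k**2, axis=1)
    row_probs = row_lev / row_lev.sum()

    # Sample actual columns and rows of A according to these probabilities.
    cols = rng.choice(n, size=c, replace=True, p=col_probs)
    rows = rng.choice(m, size=r, replace=True, p=row_probs)
    C = A[:, cols]
    R = A[rows, :]

    # Pseudoinverse-based middle matrix (a simplification of the paper's U).
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R


# Usage: approximate a rank-10 matrix and report the relative Frobenius error.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 150))
C, U, R = cur_subspace_sampling(A, k=10, c=40, r=40, seed=0)
err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
print(f"relative Frobenius error: {err:.3f}")
```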