In many applications, the data consist of (or may be naturally formulated as) an $m \times n$ matrix $A$ which may be stored on disk but which is too large to be read into random access memory (RAM) or to practically perform superlinear polynomial-time computations on. Two algorithms are presented which, when given an $m \times n$ matrix $A$, compute approximations to $A$ which are the product of three smaller matrices, $C$, $U$, and $R$, each of which may be computed rapidly. Let $A' = CUR$ be the computed approximate decomposition; both algorithms have provable bounds for the error matrix $A-A'$.

In the first algorithm, $c$ columns of $A$ and $r$ rows of $A$ are randomly chosen. If the $m \times c$ matrix $C$ consists of those $c$ columns of $A$ (after appropriate rescaling) and the $r \times n$ matrix $R$ consists of those $r$ rows of $A$ (also after appropriate rescaling), then the $c \times r$ matrix $U$ may be calculated from $C$ and $R$. For any matrix $X$, let $\|X\|_F$ and $\|X\|_2$ denote its Frobenius norm and its spectral norm, respectively. It is proven that $$ \left\|A-A'\right\|_\xi \le \min_{D:\,\mathrm{rank}(D)\le k} \left\|A-D\right\|_\xi + \mathrm{poly}(k,1/c) \left\|A\right\|_F $$ holds in expectation and with high probability, both for $\xi = 2, F$ and for all $k=1,\ldots,\mathrm{rank}(A)$; thus, by appropriate choice of $k$, $$ \left\|A-A'\right\|_2 \le \epsilon \left\|A\right\|_F $$ also holds in expectation and with high probability.

The first algorithm may be implemented without storing the matrix $A$ in RAM, provided it can make two passes over the matrix stored in external memory and use $O(m+n)$ additional RAM (assuming that $c$ and $r$ are constants, independent of the size of the input). The second algorithm is similar, except that it approximates the matrix $C$ by randomly sampling a constant number of rows of $C$. Thus, it has additional error, but it can be implemented in three passes over the matrix using only constant additional RAM.
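The construction above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's exact algorithm: the sampling probabilities (proportional to squared column/row norms) and the rescaling follow the description above, but the choice $U = C^+ A R^+$ is a simplification assumed here for clarity, whereas the paper computes $U$ from the sampled submatrices alone without revisiting all of $A$.

```python
import numpy as np

def cur_sketch(A, c, r, seed=0):
    """Illustrative CUR sketch: sample and rescale c columns and r rows
    of A with probabilities proportional to their squared Euclidean
    norms, then fit U = C^+ A R^+ (a simplified choice of U)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    fro2 = (A ** 2).sum()
    p = (A ** 2).sum(axis=0) / fro2                 # column probabilities
    q = (A ** 2).sum(axis=1) / fro2                 # row probabilities
    cols = rng.choice(n, size=c, replace=True, p=p)
    rows = rng.choice(m, size=r, replace=True, p=q)
    C = A[:, cols] / np.sqrt(c * p[cols])           # m x c, rescaled columns
    R = A[rows, :] / np.sqrt(r * q[rows])[:, None]  # r x n, rescaled rows
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)   # c x r
    return C, U, R

# On an exactly low-rank matrix, the sampled columns/rows span the
# column/row spaces with overwhelming probability, so CUR recovers A.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 100))
C, U, R = cur_sketch(A, c=20, r=20)
rel_err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
```

The rescaling by $1/\sqrt{c\,p_j}$ (and $1/\sqrt{r\,q_i}$) is what makes the sampled sketches unbiased in the sense required by the error bounds; omitting it breaks the guarantees.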
To achieve an additional error (beyond the best rank-$k$ approximation) that is at most $\epsilon \|A\|_F$, both algorithms take time which is a low-degree polynomial in $k$, $1/\epsilon$, and $1/\delta$, where $\delta > 0$ is a failure probability; the first takes time linear in $\max(m,n)$ and the second takes time independent of $m$ and $n$. The proofs of the error bounds make important use of matrix perturbation theory and of previous work on approximating matrix multiplication and computing low-rank approximations to a matrix. The probability distributions over columns and rows and the rescaling are crucial features of the algorithms and must be chosen judiciously.
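The approximate matrix multiplication primitive mentioned above can likewise be sketched. The key ingredients are the nonuniform sampling probabilities, proportional to the product of the column norm of $A$ and the matching row norm of $B$, and the $1/(s\,p_k)$ rescaling that makes the estimator unbiased; the function name and test sizes below are illustrative assumptions.

```python
import numpy as np

def approx_matmul(A, B, s, seed=0):
    """Monte Carlo estimate of A @ B: draw s indices k with probability
    p_k proportional to ||A[:, k]|| * ||B[k, :]||, then average the
    rescaled outer products A[:, k] B[k, :] / (s * p_k)."""
    rng = np.random.default_rng(seed)
    norms = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = norms / norms.sum()
    idx = rng.choice(A.shape[1], size=s, replace=True, p=p)
    est = np.zeros((A.shape[0], B.shape[1]))
    for k in idx:
        est += np.outer(A[:, k], B[k, :]) / (s * p[k])
    return est

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 30))
B = rng.standard_normal((30, 40))
est = approx_matmul(A, B, s=3000)
rel_err = np.linalg.norm(A @ B - est) / np.linalg.norm(A @ B)
```

The expected squared Frobenius error of such estimators decreases like $1/s$, which is why a modest number of sampled outer products already gives a usable approximation.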