Motivated by applications in which the data may be formulated as a matrix, we consider algorithms for several common linear algebra problems. These algorithms make more efficient use of computational resources, such as the computation time, random access memory (RAM), and the number of passes over the data, than do previously known algorithms for these problems.

In this paper, we devise two algorithms for the matrix multiplication problem. Suppose $A$ and $B$ (which are $m\times n$ and $n\times p$, respectively) are the two input matrices. In our main algorithm, we perform $c$ independent trials, where in each trial we randomly sample an element of $\{ 1,2,\ldots, n\}$ according to an appropriate probability distribution ${\cal P}$ on $\{ 1,2,\ldots, n\}$. We form an $m\times c$ matrix $C$ consisting of the sampled columns of $A$, each scaled appropriately, and we form a $c\times p$ matrix $R$ consisting of the corresponding rows of $B$, again scaled appropriately. The choice of ${\cal P}$ and the column and row scaling are crucial features of the algorithm. When these are chosen judiciously, we show that $CR$ is a good approximation to $AB$. More precisely, we show that $$ \left\|AB-CR\right\|_F = O\!\left(\left\|A\right\|_F \left\|B\right\|_F /\sqrt c\right) , $$ where $\|\cdot\|_F$ denotes the Frobenius norm, i.e., $\|A\|^2_F=\sum_{i,j}A_{ij}^2$. This algorithm can be implemented without storing the matrices $A$ and $B$ in RAM, provided it can make two passes over the matrices stored in external memory and use $O(c(m+n+p))$ additional RAM to construct $C$ and $R$. We then present a second matrix multiplication algorithm which is similar in spirit to our main algorithm.

In addition, we present a model (the pass-efficient model) in which the efficiency of these and other approximate matrix algorithms may be studied and which we argue is well suited to many applications involving massive data sets. In this model, the scarce computational resources are the number of passes over the data and the additional space and time required by the algorithm. The input matrices may be presented in any order of the entries (and not just in row or column order), as is the case in many applications where, e.g., the data have been written by multiple agents. In addition, the input matrices may be presented in a sparse representation, where only the nonzero entries are written.
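To make the sampling scheme concrete, the following is a minimal NumPy sketch of the main algorithm. It assumes the sampling probabilities are taken proportional to the product of the norm of the $k$th column of $A$ and the norm of the $k$th row of $B$, which is the standard choice for a bound of this Frobenius-norm form; it also works entirely in RAM rather than in the two-pass external-memory setting described above, and the function name `approx_matmul` is illustrative rather than taken from the paper.

```python
import numpy as np

def approx_matmul(A, B, c, seed=None):
    """Approximate A @ B by sampling c column/row pairs.

    Indices k are drawn i.i.d. with probability proportional to
    ||A[:, k]|| * ||B[k, :]||, and each sampled column/row pair is
    rescaled by 1 / sqrt(c * p_k) so that C @ R is an unbiased
    estimator of A @ B.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    assert B.shape[0] == n

    # Sampling probabilities p_k proportional to the column/row norm product.
    weights = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = weights / weights.sum()

    # c independent trials; scale each sampled pair by 1 / sqrt(c * p_k).
    idx = rng.choice(n, size=c, replace=True, p=p)
    scale = 1.0 / np.sqrt(c * p[idx])
    C = A[:, idx] * scale            # m x c: sampled, rescaled columns of A
    R = B[idx, :] * scale[:, None]   # c x p: corresponding rescaled rows of B
    return C @ R                     # approximates A @ B

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 500))
    B = rng.standard_normal((500, 300))
    c = 100
    approx = approx_matmul(A, B, c, seed=1)
    err = np.linalg.norm(A @ B - approx, "fro")
    bound = np.linalg.norm(A, "fro") * np.linalg.norm(B, "fro") / np.sqrt(c)
    print(f"||AB - CR||_F = {err:.1f}, ||A||_F ||B||_F / sqrt(c) = {bound:.1f}")
```

In the usage example, the observed error can be compared directly with the $\|A\|_F \|B\|_F / \sqrt{c}$ scale from the bound above; increasing $c$ shrinks the error at roughly the $1/\sqrt{c}$ rate.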