Randomized algorithms for very large matrix problems have received a great deal of attention in recent years. Much of this work was motivated by problems in large-scale data analysis, largely because matrices are popular structures with which to model data drawn from a wide range of application domains, and it was performed by individuals from many different research communities. While the most obvious benefit of randomization is that it can lead to faster algorithms, either in worst-case asymptotic theory or in numerical implementation, there are numerous other benefits that are at least as important. For example, the use of randomization can lead to simpler algorithms that are easier to analyze or reason about when applied in counterintuitive settings; it can lead to algorithms with more interpretable output, which matters in applications where analyst time, rather than just computational time, is the scarce resource; it can implicitly provide regularization and thus more robust output; and randomized algorithms can often be organized to exploit modern computational architectures better than classical numerical methods. This monograph will provide a detailed overview of recent work on the theory of randomized matrix algorithms, as well as the application of those ideas to the solution of practical problems in large-scale data analysis. Throughout this review, an emphasis will be placed on a few simple core ideas that underlie not only recent theoretical advances but also the usefulness of these tools in large-scale data applications. Crucial in this context is the connection with the concept of statistical leverage. This concept has long been used in statistical regression diagnostics to identify outliers, and it has recently proved crucial in the development of improved worst-case matrix algorithms that are also amenable to high-quality numerical implementation and that are useful to domain scientists.
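As a minimal illustration of the statistical leverage concept discussed above, the leverage score of a row of A is the squared norm of the corresponding row of an orthonormal basis for A's column space; high-leverage rows are exactly the "outliers" flagged in regression diagnostics. The sketch below (the function name and test data are illustrative, not from the monograph) computes these scores with NumPy via the thin SVD:

```python
import numpy as np

def leverage_scores(A, k=None):
    """Row leverage scores of A: squared row norms of an orthonormal
    basis U for A's column space (optionally truncated to rank k)."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    if k is not None:
        U = U[:, :k]  # leverage relative to the best rank-k subspace
    return np.sum(U ** 2, axis=1)

# Example: a single outlying row receives a leverage score near 1,
# while typical rows share the remaining "mass" (scores sum to the rank).
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))
A[0] *= 50.0  # make row 0 an outlier
scores = leverage_scores(A)
print(scores[0], scores[1:].mean())
```

Sampling rows of A with probabilities proportional to these scores is the core of the "random sampling" algorithms the review describes.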
This connection arises naturally when one explicitly decouples the effect of randomization in these matrix algorithms from the underlying linear algebraic structure. This decoupling also permits much finer control in the application of randomization, as well as easier exploitation of domain knowledge. Most of the review will focus on random sampling algorithms and random projection algorithms for versions of the linear least-squares problem and the low-rank matrix approximation problem. These two problems are fundamental in theory and ubiquitous in practice. Randomized methods solve them by constructing and operating on a randomized sketch of the input matrix A: for random sampling methods, the sketch consists of a small number of carefully sampled and rescaled columns or rows of A, while for random projection methods, the sketch consists of a small number of linear combinations of the columns or rows of A. Depending on the specifics of the situation, when compared with the best previously existing deterministic algorithms, the resulting randomized algorithms have worst-case running time that is asymptotically faster; their numerical implementations are faster in terms of clock time; or they can be implemented in parallel computing environments where existing numerical algorithms fail to run at all. Numerous examples illustrating these observations will be described in detail.