Stability analysis and improvement of the block Gram-Schmidt algorithm
SIAM Journal on Scientific and Statistical Computing
Fast Monte-Carlo algorithms for finding low-rank approximations
Journal of the ACM (JACM)
LU-GPU: Efficient Algorithms for Solving Dense Linear Systems on Graphics Hardware
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Improved Approximation Algorithms for Large Matrices via Random Projections
FOCS '06 Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science
Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units
Singular value decomposition on GPU using CUDA
IPDPS '09 Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing
A Randomized Algorithm for Principal Component Analysis
SIAM Journal on Matrix Analysis and Applications
Adaptive sampling and fast low-rank matrix approximation
APPROX'06/RANDOM'06 Proceedings of the 9th international conference on Approximation Algorithms for Combinatorial Optimization Problems, and 10th international conference on Randomization and Computation
Approximation of matrices using the Singular Value Decomposition (SVD) plays a central role in many science and engineering applications. However, the computational cost of an exact SVD is prohibitively high for very large matrices. In this paper, we describe a GPU-based approximate SVD algorithm for large matrices. Our method is based on QUIC-SVD [6], which exploits a tree-based structure to efficiently discover a subset of rows that spans the matrix space. We describe how to map QUIC-SVD onto the GPU and improve its speed and stability using a blocked Gram-Schmidt orthogonalization method. Using a simple matrix partitioning scheme, we extend the algorithm to out-of-core computation, making it suitable for very large matrices that exceed main memory. Results show that our GPU algorithm achieves a 6 to 7 times speedup over an optimized CPU version of QUIC-SVD, which itself is orders of magnitude faster than exact SVD methods.
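To make the idea of blocked Gram-Schmidt orthogonalization concrete, the following is a minimal CPU sketch in plain Python. It is an illustration only, not the GPU implementation described in the paper. The function name `block_gram_schmidt`, the `block_size` and `tol` parameters, and the second projection pass are assumptions introduced here. Each block of rows is first projected against the basis found so far (the step that would map to matrix-matrix products on a GPU), and is then orthonormalized internally.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def block_gram_schmidt(rows, block_size=2, tol=1e-10):
    """Orthonormalize `rows` block by block (hypothetical helper).

    Returns a list of orthonormal vectors spanning the row space of `rows`.
    Rows that are numerically dependent on earlier ones are dropped.
    """
    basis = []
    for start in range(0, len(rows), block_size):
        block = [list(r) for r in rows[start:start + block_size]]
        # Step 1: remove the components of the whole block that lie in the
        # span of the existing basis. Two passes are an assumption made here
        # to reduce the loss of orthogonality in floating point.
        for _ in range(2):
            for v in block:
                coeffs = [dot(q, v) for q in basis]
                for c, q in zip(coeffs, basis):
                    for i in range(len(v)):
                        v[i] -= c * q[i]
        # Step 2: modified Gram-Schmidt inside the block.
        new_vectors = []
        for v in block:
            for q in new_vectors:
                c = dot(q, v)
                for i in range(len(v)):
                    v[i] -= c * q[i]
            norm = math.sqrt(dot(v, v))
            if norm > tol:
                new_vectors.append([x / norm for x in v])
        basis.extend(new_vectors)
    return basis


# Usage: the third row is the sum of the first two, so only three
# orthonormal vectors are needed to span the row space.
rows = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [2.0, 1.0, 0.0], [0.0, 0.0, 3.0]]
basis = block_gram_schmidt(rows, block_size=2)
```

In this sketch the block size trades per-vector work for larger batched projections; the fixed absolute tolerance is a simplification, and a scale-aware threshold would be more robust for matrices with very large or very small entries.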