This paper is the result of concerted efforts to break the barrier between numerical accuracy and run-time efficiency in computing the fundamental decomposition of numerical linear algebra: the singular value decomposition (SVD) of general dense matrices. It is an unfortunate fact that the one-sided Jacobi SVD algorithm, numerically the most accurate, is several times slower than the generally less accurate bidiagonalization-based methods such as the QR or the divide-and-conquer algorithm. Our quest for a highly accurate and efficient SVD algorithm has led us to a new, superior variant of the Jacobi algorithm. The new algorithm inherits all the high-accuracy properties of the Jacobi algorithm, and it can outperform the QR algorithm.
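For orientation, the following is a minimal sketch of the classical one-sided (Hestenes) Jacobi SVD that the abstract takes as its baseline. It is not the new variant announced here; the function name `one_sided_jacobi_svd`, the tolerance `tol`, and the sweep limit `max_sweeps` are illustrative assumptions, and the code assumes a real matrix with at least as many rows as columns.

```python
import numpy as np

def one_sided_jacobi_svd(A, tol=1e-12, max_sweeps=60):
    """Textbook one-sided Jacobi SVD sketch (assumes real A with m >= n)."""
    U = np.array(A, dtype=float)   # working copy; its columns are orthogonalized
    n = U.shape[1]
    V = np.eye(n)                  # accumulates the plane rotations
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = U[:, p] @ U[:, p]
                beta = U[:, q] @ U[:, q]
                gamma = U[:, p] @ U[:, q]
                # Skip column pairs that are already numerically orthogonal.
                if abs(gamma) <= tol * np.sqrt(alpha * beta):
                    continue
                rotated = True
                # Rotation angle that zeroes the inner product of columns p, q.
                zeta = (beta - alpha) / (2.0 * gamma)
                if zeta == 0.0:
                    t = 1.0
                else:
                    t = np.sign(zeta) / (abs(zeta) + np.sqrt(1.0 + zeta * zeta))
                c = 1.0 / np.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same rotation to the working matrix and to V.
                for M in (U, V):
                    col_p = M[:, p].copy()
                    M[:, p] = c * col_p - s * M[:, q]
                    M[:, q] = s * col_p + c * M[:, q]
        if not rotated:
            break
    # Column norms are the singular values; normalize columns to obtain U.
    sigma = np.linalg.norm(U, axis=0)
    order = np.argsort(sigma)[::-1]
    sigma, U, V = sigma[order], U[:, order], V[:, order]
    U = U / np.where(sigma > 0, sigma, 1.0)
    return U, sigma, V
```

Each sweep visits every column pair once, which hints at why the plain method tends to be slower than bidiagonalization-based routines: several full sweeps over all pairs are typically needed before every pair is orthogonal to the requested tolerance.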