We present a new parallel algorithm for the dense symmetric eigenvalue/eigenvector problem that is based upon the tridiagonal eigensolver, Algorithm $\mbox{\sf MR}^3$, recently developed by Dhillon and Parlett. Algorithm $\mbox{\sf MR}^3$ has a complexity of $O(n^2)$ operations for computing all eigenvalues and eigenvectors of a symmetric tridiagonal problem. Moreover, the algorithm requires only $O(n)$ extra workspace and can be adapted to compute any subset of $k$ eigenpairs in $O(nk)$ time. In contrast, all earlier stable parallel algorithms for the tridiagonal eigenproblem require $O(n^3)$ operations in the worst case, while some implementations, such as divide and conquer, have an extra $O(n^2)$ memory requirement. The proposed parallel algorithm balances the workload equally among the processors by traversing a matrix-dependent representation tree that captures the sequence of computations performed by Algorithm $\mbox{\sf MR}^3$. The resulting implementation allows problems of very large size to be solved efficiently: the largest dense eigenproblem solved in-core on a 256-processor machine with 2 GBytes of memory per processor is for a matrix of size 128,000 $\times$ 128,000, which required about 8 hours of CPU time. We present comparisons with other eigensolvers and results on matrices that arise in the applications of computational quantum chemistry and finite element modeling of automobile bodies.
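The in-core figure quoted above can be sanity-checked with a little arithmetic. The following is only a rough sketch under assumptions the abstract does not state: 8-byte double-precision entries, 1 GByte taken as 2**30 bytes, and matrix data spread evenly over the processors. The function names are hypothetical. The even split of eigenvectors is a simplification of the goal of balanced workload, not the representation-tree traversal itself.

```python
# Back-of-the-envelope check of the in-core claim.
# Assumptions (not from the abstract): 8-byte entries, 1 GByte = 2**30 bytes,
# and an even distribution of matrix data over all processors.

BYTES_PER_ENTRY = 8
GBYTE = 2**30


def dense_gbytes_per_processor(n, processors):
    """GBytes each processor holds for one dense n-by-n matrix, evenly split."""
    return n * n * BYTES_PER_ENTRY / processors / GBYTE


def eigenvector_counts(n, processors):
    """Split n eigenvectors so that per-processor counts differ by at most one."""
    base, extra = divmod(n, processors)
    return [base + (1 if rank < extra else 0) for rank in range(processors)]


n, p = 128_000, 256
per_matrix = dense_gbytes_per_processor(n, p)
counts = eigenvector_counts(n, p)

print(f"one dense matrix: {per_matrix:.3f} GBytes per processor")
print(f"input plus eigenvector matrix: {2 * per_matrix:.3f} GBytes per processor")
print(f"eigenvectors per processor: {min(counts)} to {max(counts)}")
```

Under these assumptions, the input matrix and the matrix of eigenvectors together stay below the 2 GBytes available per processor, which is consistent with the reported problem size. Any additional workspace used by the reduction and back-transformation phases is not accounted for here.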