The QR algorithm computes the Schur form of a matrix and is by far the most popular approach for solving dense nonsymmetric eigenvalue problems. Over the last decade, multishift and aggressive early deflation (AED) techniques have led to significantly more efficient sequential implementations of the QR algorithm. More recently, these techniques have been incorporated into a novel parallel QR algorithm for hybrid distributed memory HPC systems. Although this yields significant performance improvements, AED can become a computational bottleneck as the number of processors increases. In this paper, we discuss a two-level approach for performing AED in a parallel environment, where the lower level consists of a novel combination of AED with the pipelined QR algorithm implemented in the ScaLAPACK routine PDLAHQR. Numerical experiments demonstrate that this new implementation further improves the performance of the parallel QR algorithm.
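For readers unfamiliar with AED, the following is a brief sketch of the deflation step as it is usually formulated in the literature on the multishift QR algorithm. The notation (the blocks H_{ij}, the window size k, the spike vector s, the scalar \beta, and the tolerance \tau) is introduced here for illustration and does not appear in the abstract; the exact deflation criterion used in the paper may differ.

```latex
% Upper Hessenberg H (n x n), partitioned with a trailing k x k deflation window H_{33};
% block rows and columns have sizes n-k-1, 1, and k.
\[
H =
\begin{pmatrix}
H_{11} & H_{12} & H_{13} \\
H_{21} & H_{22} & H_{23} \\
0      & H_{32} & H_{33}
\end{pmatrix},
\qquad
H_{32} = \beta e_1 \in \mathbb{R}^{k},
\quad \beta = h_{n-k+1,\,n-k}.
\]
% Real Schur decomposition of the window (V orthogonal, T quasi-upper triangular):
\[
H_{33} = V T V^{T}.
\]
% Orthogonal similarity with diag(I, 1, V) replaces H_{33} by T and H_{32} by the "spike" s:
\[
\begin{pmatrix} I & & \\ & 1 & \\ & & V \end{pmatrix}^{T}
H
\begin{pmatrix} I & & \\ & 1 & \\ & & V \end{pmatrix}
=
\begin{pmatrix}
H_{11} & H_{12} & H_{13} V \\
H_{21} & H_{22} & H_{23} V \\
0      & s      & T
\end{pmatrix},
\qquad
s = V^{T} H_{32} = \beta\, V^{T} e_1 .
\]
% Deflation test on the trailing spike entry (tau is an assumed small tolerance,
% e.g. of the order of the unit roundoff times a norm of H):
\[
|s_k| \le \tau
\;\Longrightarrow\;
s_k := 0 \ \text{and the eigenvalue } t_{kk} \text{ is deflated.}
\]
```

If the test fails, the corresponding eigenvalue is typically reordered toward the top of T and the test is repeated on the next trailing entry; afterwards the undeflated part is returned to Hessenberg form and its eigenvalues can serve as shifts. Computing the Schur form of the window H_{33} is itself a smaller eigenvalue problem, and this is the step the abstract's lower level addresses by applying the pipelined QR algorithm of PDLAHQR.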