We present a novel variant of the parallel QR algorithm for solving dense nonsymmetric eigenvalue problems on hybrid distributed high performance computing systems. For this purpose, we introduce the concept of multiwindow bulge chain chasing and parallelize aggressive early deflation. The multiwindow approach ensures that most computations when chasing chains of bulges are performed in level 3 BLAS operations, while aggressive early deflation aims to speed up the convergence of the QR algorithm. Mixed MPI-OpenMP coding techniques are used to port the codes to distributed memory platforms with multithreaded nodes, such as multicore processors. Numerical experiments, including a number of examples from applications, confirm that our parallel QR algorithm outperforms the existing ScaLAPACK code: for sufficiently large problems, the implementation is one to two orders of magnitude faster.
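For orientation, the following is a minimal serial sketch of the classical shifted QR iteration with standard deflation, the baseline whose convergence aggressive early deflation is meant to accelerate. It is not the parallel algorithm described above: it works on the full matrix with explicit QR factorizations rather than on a Hessenberg matrix with implicit bulge chasing, and it uses complex arithmetic for simplicity. The function names `wilkinson_shift` and `qr_eigenvalues`, the tolerance, and the iteration cap are illustrative assumptions, not part of any library or of the implementation discussed here.

```python
import numpy as np


def wilkinson_shift(H):
    """Return the eigenvalue of the trailing 2x2 block of H closest to H[-1, -1]."""
    a, b, c, d = H[-2, -2], H[-2, -1], H[-1, -2], H[-1, -1]
    half_trace = (a + d) / 2
    det = a * d - b * c
    disc = np.sqrt(complex(half_trace * half_trace - det))
    candidates = (half_trace + disc, half_trace - disc)
    return min(candidates, key=lambda mu: abs(mu - d))


def qr_eigenvalues(A, tol=1e-12, max_iter=10_000):
    """Eigenvalues of a square matrix via explicit shifted QR steps with deflation.

    Illustrative only: a serial, unblocked sketch (hypothetical helper).
    """
    H = np.array(A, dtype=complex)
    eigenvalues = []
    iterations = 0
    while H.shape[0] > 1:
        m = H.shape[0]
        # Standard deflation: once the last row is negligible left of the
        # diagonal, H[-1, -1] is accepted as an eigenvalue and the problem shrinks.
        if np.linalg.norm(H[-1, :-1]) <= tol * np.linalg.norm(H):
            eigenvalues.append(H[-1, -1])
            H = H[:-1, :-1]
            continue
        if iterations >= max_iter:
            raise RuntimeError("QR iteration did not converge")
        # One shifted QR step: H - mu*I = Q R, then H <- R Q + mu*I,
        # which is a unitary similarity transformation of H.
        mu = wilkinson_shift(H)
        Q, R = np.linalg.qr(H - mu * np.eye(m))
        H = R @ Q + mu * np.eye(m)
        iterations += 1
    eigenvalues.append(H[0, 0])
    return np.array(eigenvalues)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((6, 6))
    print(qr_eigenvalues(A))
```

In this sketch each deflation removes one eigenvalue at a time and every step touches the whole active matrix. The techniques named in the abstract target exactly these two costs: aggressive early deflation seeks to detect converged eigenvalues earlier, and chasing chains of bulges in windows lets most of the update work be expressed as matrix-matrix (level 3 BLAS) operations.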