This paper presents several new variants of the single-vector Arnoldi algorithm for computing approximations to eigenvalues and eigenvectors of a non-symmetric matrix. The context of this work is the efficient implementation of industrial-strength, parallel, sparse eigensolvers, in which robustness is of paramount importance, alongside efficiency. For this reason, Arnoldi variants that employ Gram-Schmidt with iterative reorthogonalization are considered. The proposed algorithms aim to improve scalability on massively parallel platforms with many processors. The main goal is to reduce the performance penalty induced by the global communications required for vector inner products and norms. This is achieved by reorganizing the stages that involve these operations, particularly the orthogonalization and normalization of vectors, so that several global communications are grouped together while the numerical stability of the process is maintained. The numerical properties of the new algorithms are assessed on a large set of test matrices, and scalability analyses show a significant improvement in parallel performance.
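To make the idea of grouping global communications concrete, the following is a minimal sketch in dependency-free Python. It is not one of the paper's variants; it shows one plausible way to fuse reductions in an Arnoldi step that uses classical Gram-Schmidt with a single reorthogonalization pass. The names `Reducer`, `fused`, `subtract_projection`, and `arnoldi_cgs2_fused` are hypothetical, the `Reducer` class only simulates a global reduction by counting calls, and the `1e-8` cancellation threshold is an arbitrary assumption. The norm of the new vector is obtained from the Pythagorean identity rather than from a separate reduction.

```python
import math


def dot(x, y):
    """Local inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, x) for row in A]


def subtract_projection(w, V, coeffs):
    """Return w - sum_i coeffs[i] * V[i] (purely local work, no reduction)."""
    out = list(w)
    for c, v in zip(coeffs, V):
        for k in range(len(out)):
            out[k] -= c * v[k]
    return out


class Reducer:
    """Hypothetical stand-in for a global reduction such as an all-reduce."""

    def __init__(self):
        self.calls = 0

    def fused(self, V, w):
        # One simulated reduction returns every v_i . w together with w . w.
        self.calls += 1
        return [dot(v, w) for v in V], dot(w, w)


def arnoldi_cgs2_fused(A, v0, m, reducer):
    """Run m Arnoldi steps; return basis V (list of vectors) and H ((m+1) x m)."""
    norm0 = math.sqrt(dot(v0, v0))
    V = [[x / norm0 for x in v0]]
    H = [[0.0] * m for _ in range(m + 1)]
    for j in range(m):
        w = matvec(A, V[j])
        # Pass 1: classical Gram-Schmidt, all coefficients from one reduction.
        h, _ = reducer.fused(V, w)
        w = subtract_projection(w, V, h)
        # Pass 2: reorthogonalization; the same reduction also returns ||w||^2.
        c, ww = reducer.fused(V, w)
        w = subtract_projection(w, V, c)
        # Pythagoras: ||w - V c||^2 = ||w||^2 - ||c||^2 when V is orthonormal,
        # so the norm needs no reduction of its own.
        beta2 = ww - dot(c, c)
        if beta2 <= 1e-8 * ww:
            # Assumed safeguard: severe cancellation, so pay for an explicit norm.
            _, beta2 = reducer.fused([], w)
        beta = math.sqrt(beta2)
        for i in range(j + 1):
            H[i][j] = h[i] + c[i]
        H[j + 1][j] = beta
        if beta == 0.0:
            break  # breakdown: the Krylov subspace is invariant
        V.append([x / beta for x in w])
    return V, H
```

Calling `arnoldi_cgs2_fused(A, v0, m, Reducer())` on a small dense matrix uses two simulated reductions per step when the safeguard does not trigger, compared with three if the norm were reduced separately. In a real distributed code each call to `fused` would correspond to a single collective operation over locally computed partial sums.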