Parallel Performance Analysis of the Improved Quasi-Minimal Residual Method on Bulk Synchronous Parallel Architectures

Authors:
Tianruo Yang; Hai-Xiang Lin
Affiliations:
Department of Computer and Information Science, Linköping University, S-581 83, Linköping, Sweden;Department of Technical Mathematics and Computer Science, TU Delft, Mekelweg 4, 2628 CD, Delft, The Netherlands
Venue:
The Journal of Supercomputing
Year:
1999

Citing 11
Cited 1

A bridging model for parallel computation

Communications of the ACM
An efficient nonsymmetric Lanczos method on parallel vector computers

Journal of Computational and Applied Mathematics
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices

SIAM Journal on Scientific Computing
An implementation of the QMR method based on coupled two-term recurrences

SIAM Journal on Scientific Computing
Parallel iterative solution of sparse linear systems on a transputer network

Parallel computation
Solving Linear Systems on Vector and Shared Memory Computers

Solving Linear Systems on Vector and Shared Memory Computers
Parallel Ocean Flow Computations on a Regular and on an Irregular Grid

HPCN Europe 1996 Proceedings of the International Conference and Exhibition on High-Performance Computing and Networking
The Improved Quasi-minimal Residual Method on Massively Distributed Memory Computers

HPCN Europe '97 Proceedings of the International Conference and Exhibition on High-Performance Computing and Networking
Parallel iterative solution methods for linear finite element computations on the Cray T3D

HPCN Europe '95 Proceedings of the International Conference and Exhibition on High-Performance Computing and Networking
A Parallel Version of the Quasi-Minimal Residual Method, Based on Coupled Two-Term Recurrences

PARA '96 Proceedings of the Third International Workshop on Applied Parallel Computing, Industrial Computation and Optimization
Solving Sparse Least Squares Problems on Massively Distributed Memory Computers

APDC '97 Proceedings of the 1997 Advances in Parallel and Distributed Computing Conference (APDC '97)

Benchmarking the CLI for I/O-Intensive Computing

IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 13 - Volume 14

Abstract

For the solution of unsymmetric linear systems of equations, we have proposed an improved version of the quasi-minimal residual (IQMR) method [21] that uses the Lanczos process as a major component and combines elements of numerical stability and parallel algorithm design. For the Lanczos process, stability is obtained by a coupled two-term procedure that generates Lanczos vectors scaled to unit length. The algorithm is derived such that all inner products and matrix-vector multiplications of a single iteration step are independent, and the communication time required for the inner products can be overlapped efficiently with computation time. In this paper, we use the Bulk Synchronous Parallel (BSP) model to design a fully efficient, scalable and portable parallel IQMR algorithm and to provide accurate performance prediction of the algorithm for a wide range of architectures, including the Cray T3D, the Parsytec GC/PowerPlus, and a cluster of workstations connected by Ethernet. This performance model gives useful insight into the time complexity of the IQMR method using only a few system-dependent parameters, based on simple and accurate cost modeling. The theoretical performance predictions are compared with measured timing results of a numerical application from ocean flow simulation.
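
As a rough illustration of why independent inner products matter under BSP cost accounting, the sketch below uses the standard BSP superstep cost w + h*g + l (local work, plus words communicated times the per-word cost g, plus the barrier cost l). It is not the performance model from the paper: the machine parameters, the all-to-all exchange of partial sums, and the names `BspMachine`, `separate_reductions`, and `grouped_reduction` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BspMachine:
    """Hypothetical BSP machine description (all values are illustrative)."""
    p: int      # number of processors
    g: float    # cost per communicated word, in units of one flop
    l: float    # cost of one barrier synchronization, in units of one flop


def superstep_cost(w: float, h: float, m: BspMachine) -> float:
    """Standard BSP cost of one superstep: w + h*g + l."""
    return w + h * m.g + m.l


def separate_reductions(k: int, m: BspMachine) -> float:
    """k inner products, each combined in its own superstep (assumed scheme)."""
    h = m.p - 1  # each processor exchanges one partial sum with every peer
    w = m.p - 1  # additions needed to combine the p partial sums
    return k * superstep_cost(w, h, m)


def grouped_reduction(k: int, m: BspMachine) -> float:
    """k independent inner products combined in a single superstep."""
    h = k * (m.p - 1)  # same total traffic, sent in one batch
    w = k * (m.p - 1)  # same total number of additions
    return superstep_cost(w, h, m)


# Made-up parameters: 4 processors, g = 2 flops/word, l = 100 flops.
machine = BspMachine(p=4, g=2.0, l=100.0)
saving = separate_reductions(3, machine) - grouped_reduction(3, machine)
print(saving)  # (k - 1) * l = 200.0
```

In this simplified accounting the work and communication terms are identical in both variants, so grouping k independent inner products saves exactly (k - 1) barrier costs per iteration. On machines with a large l, this is where an algorithm with independent inner products would be expected to gain.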