The Improved CGS Method for Large and Sparse Linear Systems on Bulk Synchronous Parallel Architecture

Authors:
Affiliations:
Venue: ICA3PP '02 Proceedings of the Fifth International Conference on Algorithms and Architectures for Parallel Processing
Year: 2002

Citing 0
Cited 3

Benchmarking the CLI for I/O-Intensive Computing
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 13 - Volume 14

An improved parallel hybrid bi-conjugate gradient method suitable for distributed parallel computing
Journal of Computational and Applied Mathematics

Minimizing synchronizations in sparse iterative solvers for distributed supercomputers
Computers & Mathematics with Applications

Abstract

We propose an improved version of the CGS method for the solution of large and sparse linear systems of equations with unsymmetric coefficient matrices. The proposed method combines elements of numerical stability and parallel algorithm design without increasing computational costs. The algorithm is derived such that all matrix-vector multiplications, inner products and vector updates of a single iteration step are independent, and the communication time required for the inner products can be overlapped efficiently with the computation time of the vector updates. Therefore, the cost of global communication, which represents the performance bottleneck, can be significantly reduced. In this paper, the Bulk Synchronous Parallel (BSP) model is used to design a fully efficient, scalable and portable parallel version of the proposed algorithm and to provide accurate performance prediction of the algorithm for a wide range of architectures, including the Cray T3D, the Parsytec, and a cluster of workstations connected by an Ethernet. This performance model uses only a few system-dependent parameters, based on simple and accurate cost modelling, to provide useful insight into the time complexity of the method. The theoretical performance predictions are compared with some preliminary measured timing results of a numerical application from ocean flow simulation.
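
To illustrate why reducing global synchronization points matters under BSP, the following is a minimal sketch of the standard BSP superstep cost, w + h*g + l, where w is the maximum local work, h the maximum number of words any processor sends or receives, g the per-word communication cost, and l the barrier synchronization cost. The abstract does not give the paper's actual cost expressions, so the class and function names, the machine parameters, and the work and message counts below are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BSPMachine:
    """Hypothetical machine description using the standard BSP parameters."""
    p: int      # number of processors
    g: float    # communication cost per word (in flop time units)
    l: float    # cost of one barrier synchronization (in flop time units)


def superstep_cost(machine, w, h):
    """Standard BSP cost of a single superstep: w + h*g + l."""
    return w + h * machine.g + machine.l


def iteration_cost(machine, supersteps):
    """Total cost of an iteration given as a list of (w, h) supersteps."""
    return sum(superstep_cost(machine, w, h) for w, h in supersteps)


# Made-up parameters, not measurements from the paper.
machine = BSPMachine(p=4, g=2.0, l=100.0)

# Three inner products, each followed by its own global synchronization.
separate = [(1000, 3), (1000, 3), (1000, 3)]

# The same work and the same communication volume grouped in one superstep.
merged = [(3000, 9)]

cost_separate = iteration_cost(machine, separate)
cost_merged = iteration_cost(machine, merged)
saving = cost_separate - cost_merged

print(cost_separate, cost_merged, saving)
```

In this toy setting the total work and communication volume are unchanged, so the whole saving is the two barrier costs that are no longer paid. Overlapping communication with vector updates, as the abstract describes, would reduce the visible communication term further, which this simple additive model does not capture.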