The Improved CGS Method for Large and Sparse Linear Systems on Bulk Synchronous Parallel Architecture

  • Authors:
  • Affiliations:
  • Venue: ICA3PP '02 Proceedings of the Fifth International Conference on Algorithms and Architectures for Parallel Processing
  • Year: 2002

Abstract

We propose an improved version of the CGS method for the solution of large and sparse linear systems of equations with unsymmetric coefficient matrices. The proposed method combines elements of numerical stability and parallel algorithm design without increasing computational costs. The algorithm is derived such that all matrix-vector multiplications, inner products, and vector updates of a single iteration step are independent, and the communication time required for the inner products can be overlapped efficiently with the computation time of the vector updates. Therefore, the cost of global communication, which represents the performance bottleneck, can be significantly reduced. In this paper, the Bulk Synchronous Parallel (BSP) model is used to design a fully efficient, scalable, and portable parallel version of the proposed algorithm and to provide accurate performance predictions for a wide range of architectures, including the Cray T3D, the Parsytec, and a cluster of workstations connected by Ethernet. The performance model uses only a few system-dependent parameters, based on simple and accurate cost modelling, to provide useful insight into the time complexity of the method. The theoretical performance predictions are compared with some preliminary measured timing results of a numerical application from ocean flow simulation.
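
To make the role of the system-dependent parameters concrete, the following is a minimal sketch of the standard BSP cost formula, in which one superstep costs w + h·g + l (local work, plus h words communicated at gap g, plus synchronization latency l). It is not the paper's own performance model: the names `BspMachine`, `superstep_cost`, and `inner_product_cost`, the two-superstep layout of the inner product, and all parameter values are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BspMachine:
    """Hypothetical machine description; all costs are in flop time units."""
    p: int      # number of processors
    g: float    # communication cost per word
    l: float    # synchronization latency per superstep


def superstep_cost(machine: BspMachine, w: float, h: float) -> float:
    """Standard BSP cost of one superstep: w + h*g + l."""
    return w + h * machine.g + machine.l


def inner_product_cost(machine: BspMachine, n: int) -> float:
    """Rough cost of a distributed inner product of two length-n vectors.

    Assumed layout (an illustration, not the paper's scheme):
      superstep 1: about 2*n/p local flops, then every processor sends its
                   partial sum to all others, so h = p - 1;
      superstep 2: every processor adds the p partial sums.
    """
    local_flops = 2 * n / machine.p
    step1 = superstep_cost(machine, local_flops, machine.p - 1)
    step2 = superstep_cost(machine, machine.p, 0)
    return step1 + step2


if __name__ == "__main__":
    # Made-up parameter values, not measurements of any real machine.
    machine = BspMachine(p=4, g=2.0, l=100.0)
    for n in (1_000, 100_000):
        print(n, inner_product_cost(machine, n))
```

Under these assumed values, the two synchronizations contribute 2·l regardless of n, so for short vectors the latency term is a sizeable share of the inner-product cost. This is the kind of global communication cost that overlapping inner products with vector updates is meant to hide.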