We propose an improved version of the CGS method for the solution of large, sparse linear systems of equations with unsymmetric coefficient matrices. The proposed method combines elements of numerical stability and parallel algorithm design without increasing computational costs. The algorithm is derived such that all matrix-vector multiplications, inner products and vector updates of a single iteration step are independent, and the communication time required for the inner products can be overlapped efficiently with the computation time of the vector updates. Therefore, the cost of global communication, which represents the performance bottleneck, can be significantly reduced. In this paper, the Bulk Synchronous Parallel (BSP) model is used to design a fully efficient, scalable and portable parallel version of the proposed algorithm and to provide accurate performance prediction of the algorithm for a wide range of architectures, including the Cray T3D, the Parsytec, and a cluster of workstations connected by an Ethernet. This performance model uses only a few system-dependent parameters, based on simple and accurate cost modelling, to provide useful insight into the time complexity of the method. The theoretical performance predictions are compared with some preliminary measured timing results of a numerical application from ocean flow simulation.
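To illustrate why reducing synchronization points matters under the BSP model, the following is a minimal sketch of the standard BSP superstep cost, w + h*g + l, applied to the inner products of one iteration. It is not the paper's performance model: the names `BspMachine`, `superstep_cost`, `separate_cost` and `grouped_cost` are hypothetical, the machine parameters are made-up values, and each inner product is assumed to be reduced by having every processor send its partial sum to the p - 1 others. The sketch captures only the saving from fewer barriers, not the overlap of communication with vector updates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BspMachine:
    """Hypothetical BSP machine parameters, all in flop time units."""
    p: int     # number of processors
    g: float   # communication cost per word
    l: float   # cost of one barrier synchronization


def superstep_cost(m: BspMachine, work: float, h: float) -> float:
    """Standard BSP cost of one superstep: local work + h*g + l."""
    return work + h * m.g + m.l


def inner_product_work(m: BspMachine, n: int) -> float:
    """Local flops for one inner product of length-n vectors split over p processors."""
    return 2.0 * n / m.p


def separate_cost(m: BspMachine, n: int, k: int) -> float:
    """k inner products, each reduced in its own superstep (k barriers)."""
    h = m.p - 1  # assumption: one partial sum sent to every other processor
    return k * superstep_cost(m, inner_product_work(m, n), h)


def grouped_cost(m: BspMachine, n: int, k: int) -> float:
    """k independent inner products reduced together in one superstep (1 barrier)."""
    h = k * (m.p - 1)  # k partial sums per processor in a single exchange
    return superstep_cost(m, k * inner_product_work(m, n), h)


if __name__ == "__main__":
    machine = BspMachine(p=4, g=2.0, l=100.0)  # made-up parameter values
    n, k = 1000, 3
    print("separate:", separate_cost(machine, n, k))
    print("grouped: ", grouped_cost(machine, n, k))
    print("saved:   ", separate_cost(machine, n, k) - grouped_cost(machine, n, k))
```

In this simplified model the work and communication volume are identical in both cases, so the saving is exactly (k - 1) * l: the number of barriers removed times the synchronization cost. This is why making the inner products of an iteration independent pays off most on machines with a large l.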