In this paper we study the parallelization of PCGLS, a basic iterative method whose main idea is to apply the preconditioned conjugate gradient method to the normal equations. Two important questions are discussed: what is the best possible data distribution, and which communication network topology is most suitable for solving least squares problems on massively parallel distributed-memory computers? A theoretical model of the data distribution and communication phases is presented, which allows us to give a detailed execution time complexity analysis and to investigate its usefulness. It is shown that implementing PCGLS with a row-block decomposition of the coefficient matrix on a ring communication structure is the most efficient choice. Performance tests of the developed parallel PCGLS algorithm have been carried out on the massively parallel distributed-memory system Parsytec, and the experimental timing results are compared with the theoretical execution time complexity analysis.
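To make the row-block idea concrete, the following is a minimal serial sketch of PCGLS in plain Python, not the paper's implementation. Several things here are assumptions made for illustration: the abstract does not name the preconditioner, so a diagonal (column-norm) scaling is used; the function names `split_rows`, `matvec_blocks`, `rmatvec_blocks` and `pcgls` are hypothetical; and the "processors" are simulated by looping over contiguous row blocks. The point it shows is where communication would arise: the product A v needs only locally owned rows, while A^T r requires summing one length-n partial vector per block, which is the global reduction a ring network would carry out.

```python
import math

def split_rows(m, nblocks):
    """Contiguous (start, stop) row ranges, one per simulated processor."""
    base, extra = divmod(m, nblocks)
    ranges, start = [], 0
    for k in range(nblocks):
        stop = start + base + (1 if k < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

def matvec_blocks(A, v, ranges):
    """q = A v. Each block computes its own rows; no communication is needed."""
    q = []
    for start, stop in ranges:
        q.extend(sum(a * x for a, x in zip(A[i], v)) for i in range(start, stop))
    return q

def rmatvec_blocks(A, r, ranges):
    """s = A^T r. Each block forms a length-n partial sum from its rows;
    adding the partials stands in for the global reduction step."""
    n = len(A[0])
    total = [0.0] * n
    for start, stop in ranges:
        partial = [0.0] * n
        for i in range(start, stop):
            for j in range(n):
                partial[j] += A[i][j] * r[i]
        total = [t + p for t, p in zip(total, partial)]
    return total

def pcgls(A, b, nblocks=2, tol=1e-10, maxiter=100):
    """Preconditioned CGLS for min ||A x - b||_2 with x0 = 0.
    Assumption: diagonal preconditioner M = diag(column 2-norms of A)."""
    m, n = len(A), len(A[0])
    ranges = split_rows(m, nblocks)
    d = [math.sqrt(sum(A[i][j] ** 2 for i in range(m))) or 1.0 for j in range(n)]
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    s = [v / dj for v, dj in zip(rmatvec_blocks(A, r, ranges), d)]
    p = list(s)
    gamma = sum(v * v for v in s)
    gamma0 = gamma
    for _ in range(maxiter):
        if gamma <= tol * tol * gamma0:           # preconditioned gradient is small
            break
        t = [pj / dj for pj, dj in zip(p, d)]     # t = M^{-1} p
        q = matvec_blocks(A, t, ranges)           # q = A t
        qq = sum(v * v for v in q)
        if qq == 0.0:
            break
        alpha = gamma / qq
        x = [xj + alpha * tj for xj, tj in zip(x, t)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = [v / dj for v, dj in zip(rmatvec_blocks(A, r, ranges), d)]
        gamma_new = sum(v * v for v in s)
        beta = gamma_new / gamma
        p = [sj + beta * pj for sj, pj in zip(s, p)]
        gamma = gamma_new
    return x
```

For example, `pcgls([[1, 0], [1, 1], [1, 2], [1, 3]], [1, 2, 2, 4], nblocks=2)` returns approximately `[0.9, 0.9]`, the least squares line fit for those four points. The number of blocks changes only how the work is partitioned, not the computed solution (up to rounding).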