The performance of conjugate gradient (CG) algorithms was analyzed on SIMD, MIMD, and mixed-mode parallel machines for the system of linear equations that results from finite-differencing the neutron diffusion equation. A block preconditioner based on the incomplete Cholesky factorization was used to accelerate the conjugate gradient search. The issues involved in mapping both the unpreconditioned and preconditioned conjugate gradient algorithms onto the mixed-mode PASM prototype, the SIMD MasPar MP-1, and the MIMD Intel Paragon XP/S are discussed. On PASM, the mixed-mode implementation outperformed both the SIMD-only and the MIMD-only implementations. Theoretical performance predictions were analyzed and compared with the experimental results on the MasPar MP-1 and the Paragon XP/S. Other issues addressed include the impact of the number of processors on execution time, the effect of the interprocessor communication network on performance, and the relationship between the number of processors and the quality of the preconditioning. Application studies such as this one are necessary in the development of software tools for mapping algorithms onto either a single parallel machine or a heterogeneous suite of parallel machines.
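To make the link between block count and preconditioner quality concrete, here is a minimal serial sketch of preconditioned CG with a block-diagonal Cholesky preconditioner. It is not the implementation studied in the paper. The 1D model matrix tridiag(-1, 2.1, -1), the function names, and the idea of one block per processor are all assumptions made for illustration. Each diagonal block is tridiagonal, so zero-fill incomplete Cholesky coincides with exact Cholesky here. Only the couplings between blocks are dropped.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(n, diag, off, x):
    """y = A x for the n-by-n matrix tridiag(off, diag, off) (assumed model)."""
    y = [diag * xi for xi in x]
    for i in range(n - 1):
        y[i] += off * x[i + 1]
        y[i + 1] += off * x[i]
    return y


def factor_blocks(n, diag, off, nblocks):
    """Cholesky-factor each diagonal block; couplings between blocks are dropped."""
    assert n % nblocks == 0, "sketch assumes equally sized blocks"
    size = n // nblocks
    blocks = []
    for b in range(nblocks):
        d, e = [], []  # L has diagonal d and subdiagonal e
        for i in range(size):
            if i == 0:
                d.append(math.sqrt(diag))
            else:
                e.append(off / d[i - 1])
                d.append(math.sqrt(diag - e[i - 1] ** 2))
        blocks.append((b * size, d, e))
    return blocks


def apply_preconditioner(blocks, r):
    """Solve M z = r block by block, where each block of M is L L^T."""
    z = [0.0] * len(r)
    for start, d, e in blocks:
        m = len(d)
        y = [0.0] * m
        for i in range(m):  # forward solve L y = r
            s = r[start + i] - (e[i - 1] * y[i - 1] if i > 0 else 0.0)
            y[i] = s / d[i]
        for i in range(m - 1, -1, -1):  # back solve L^T z = y
            s = y[i] - (e[i] * z[start + i + 1] if i < m - 1 else 0.0)
            z[start + i] = s / d[i]
    return z


def pcg(n, diag, off, b, nblocks, tol=1e-10, max_iter=1000):
    """Preconditioned CG from a zero initial guess; returns (x, iterations)."""
    blocks = factor_blocks(n, diag, off, nblocks)
    x = [0.0] * n
    r = list(b)
    z = apply_preconditioner(blocks, r)
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for k in range(1, max_iter + 1):
        Ap = matvec(n, diag, off, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, k
        z = apply_preconditioner(blocks, r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    n = 64
    rhs = [1.0] * n
    for nblocks in (1, 2, 8, 32):
        _, iters = pcg(n, 2.1, -1.0, rhs, nblocks)
        print(nblocks, "blocks:", iters, "iterations")
```

Running the script prints the iteration count for several block counts. With a single block the preconditioner is the exact inverse of the model matrix. Splitting the matrix into more blocks discards more off-diagonal couplings and weakens the preconditioner. That is one way the number of processors could affect preconditioning quality, under the one-block-per-processor assumption.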