The graphics processing unit (GPU) is used to solve large linear systems derived from partial differential equations. The differential equations studied are strongly convection-dominated, of various sizes, and common to many fields, including computational fluid dynamics, heat transfer, and structural mechanics. The paper presents comparisons between GPU and CPU implementations of several well-known iterative methods, including Kaczmarz's, Cimmino's, component averaging, conjugate gradient normal residual (CGNR), symmetric successive overrelaxation-preconditioned conjugate gradient, and conjugate-gradient-accelerated component-averaged row projections (CARP-CG). Computations are performed with dense as well as general banded systems. The results demonstrate that our GPU implementation outperforms CPU implementations of these algorithms, as well as previously studied parallel implementations on Linux clusters and shared memory systems. Although the CGNR method had begun to fall out of favor for solving such problems, on the problems studied in this paper its GPU implementation performed better than the other methods, including a cluster implementation of the CARP-CG method.
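As a rough illustration of two of the methods named above, the following is a minimal serial sketch of a Kaczmarz row-projection sweep and of CGNR (conjugate gradients applied to the normal equations A^T A x = A^T b). It is not the paper's GPU implementation. The function names, the dense list-of-lists storage, the relaxation parameter `relax`, the iteration limits, and the small test matrix are all assumptions made for this example.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Compute A x for a dense matrix stored as a list of rows."""
    return [dot(row, x) for row in A]


def rmatvec(A, y):
    """Compute A^T y without forming the transpose explicitly."""
    out = [0.0] * len(A[0])
    for row, yi in zip(A, y):
        for j, a in enumerate(row):
            out[j] += a * yi
    return out


def kaczmarz(A, b, sweeps=200, relax=1.0):
    """Cyclic Kaczmarz: project x onto each row's hyperplane in turn."""
    x = [0.0] * len(A[0])
    for _ in range(sweeps):
        for row, bi in zip(A, b):
            norm2 = dot(row, row)
            if norm2 == 0.0:
                continue  # skip empty rows (assumed harmless here)
            step = relax * (bi - dot(row, x)) / norm2
            x = [xj + step * a for xj, a in zip(x, row)]
    return x


def cgnr(A, b, iters=50, tol=1e-20):
    """CGNR: conjugate gradients on A^T A x = A^T b, starting from x = 0."""
    x = [0.0] * len(A[0])
    r = list(b)              # residual b - A x for x = 0
    z = rmatvec(A, r)        # normal-equations residual A^T r
    p = list(z)
    zz = dot(z, z)
    for _ in range(iters):
        if zz <= tol:        # stop once ||A^T r||^2 is negligible
            break
        w = matvec(A, p)
        alpha = zz / dot(w, w)
        x = [xj + alpha * pj for xj, pj in zip(x, p)]
        r = [rj - alpha * wj for rj, wj in zip(r, w)]
        z = rmatvec(A, r)
        zz_new = dot(z, z)
        p = [zj + (zz_new / zz) * pj for zj, pj in zip(z, p)]
        zz = zz_new
    return x


# Small nonsymmetric banded system chosen for illustration only.
A = [[4.0, 1.0, 0.0],
     [-2.0, 5.0, 1.0],
     [0.0, -1.0, 3.0]]
b = [6.0, 11.0, 7.0]         # equals A applied to [1, 2, 3]
print(kaczmarz(A, b))
print(cgnr(A, b))
```

Each CGNR iteration is dominated by one product with A and one with A^T. Both are data-parallel across rows or columns, whereas the cyclic Kaczmarz sweep updates x one row at a time. That difference is one plausible reason, offered here as an assumption rather than the paper's stated explanation, why CGNR maps well to a GPU.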