GPU-based parallel algorithms for sparse nonlinear systems

Authors:
V. Galiano; H. Migallón; V. Migallón; J. Penadés
Affiliations:
Department of Physics and Computer Architectures, University Miguel Hernández, E-03202, Elche, Alicante, Spain;Department of Physics and Computer Architectures, University Miguel Hernández, E-03202, Elche, Alicante, Spain;Department of Computer Science and Artificial Intelligence, University of Alicante, E-03071, Alicante, Spain;Department of Computer Science and Artificial Intelligence, University of Alicante, E-03071, Alicante, Spain
Venue:
Journal of Parallel and Distributed Computing
Year:
2012

Abstract

This work describes parallel algorithms for solving sparse nonlinear systems using CUDA (Compute Unified Device Architecture) on a GPU (Graphics Processing Unit). The proposed algorithms combine the Fletcher-Reeves version of the nonlinear conjugate gradient method with a polynomial preconditioner based on block two-stage methods. Several parallelization strategies and different storage formats for sparse matrices are discussed. The reported numerical experiments analyze the behavior of these algorithms in a fine-grained parallel environment compared with a thread-based environment.
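
As a rough illustration of two ingredients named in the abstract, the sketch below pairs a compressed sparse row (CSR) matrix-vector product with an unpreconditioned Fletcher-Reeves iteration. It is a sequential Python sketch and not the authors' CUDA code. The abstract does not state a model problem, so the system A x + x^3 = b, the tridiagonal matrix, the backtracking line search, the periodic restart, and all function names are assumptions chosen for illustration. The polynomial preconditioner is omitted.

```python
import math

def tridiagonal_csr(n):
    """Hypothetical test matrix: tridiag(-1, 2, -1) stored in CSR form."""
    values, col_idx, row_ptr = [], [], [0]
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr

def csr_matvec(values, col_idx, row_ptr, x):
    """y = A x for a CSR matrix; rows are independent, so a GPU version
    could assign one thread (or one warp) to each row."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def energy(A, b, x):
    # Assumed functional f(x) = 1/2 x^T A x + 1/4 sum(x_i^4) - b^T x,
    # whose gradient vanishes exactly when A x + x^3 = b.
    Ax = csr_matvec(*A, x)
    return 0.5 * dot(x, Ax) + 0.25 * sum(t ** 4 for t in x) - dot(b, x)

def gradient(A, b, x):
    Ax = csr_matvec(*A, x)
    return [Ax[i] + x[i] ** 3 - b[i] for i in range(len(x))]

def fletcher_reeves(A, b, x, tol=1e-6, max_iter=2000):
    """Nonlinear conjugate gradient with the Fletcher-Reeves beta."""
    n = len(x)
    g = gradient(A, b, x)
    d = [-t for t in g]
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) < tol:
            break
        # Restart with steepest descent every n steps, or whenever the
        # current direction is not a descent direction.
        if k % n == 0 or dot(g, d) >= 0.0:
            d = [-t for t in g]
        # Backtracking (Armijo) line search; an assumption, not from the paper.
        alpha, f0, slope = 1.0, energy(A, b, x), dot(g, d)
        while alpha > 1e-12:
            trial = [xi + alpha * di for xi, di in zip(x, d)]
            if energy(A, b, trial) <= f0 + 1e-4 * alpha * slope:
                break
            alpha *= 0.5
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = gradient(A, b, x)
        beta = dot(g_new, g_new) / dot(g, g)   # Fletcher-Reeves formula
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x
```

For example, with A = tridiagonal_csr(4) and b = [2.0, 1.0, 1.0, 2.0], the exact solution of the assumed system is the all-ones vector, so fletcher_reeves(A, b, [0.0] * 4) should return values close to 1. In a GPU setting, the matrix-vector product and the dot products are the natural kernels to parallelize, and the storage format affects how memory is accessed in the former.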