Forward and back substitution algorithms on GPU: a case study on modified incomplete Cholesky Preconditioner for three-dimensional finite difference method

Authors:
Yigitcan Aksari;Harun Artuner
Affiliations:
Department of Computer Engineering, Hacettepe University, Beytepe, Ankara, Turkey;Department of Computer Engineering, Hacettepe University, Beytepe, Ankara, Turkey
Venue:
The Journal of Supercomputing
Year:
2012

Abstract

Forward and back substitution algorithms are widely used to solve linear systems of equations once an LU decomposition of the coefficient matrix has been computed. They are also essential in implementing high-performance preconditioners, which improve the convergence of iterative methods. In this paper, we describe an efficient approach to implementing forward and back substitution on a GPU and give the implementation details of these algorithms for a Modified Incomplete Cholesky preconditioner for the Conjugate Gradient (CG) algorithm. The resulting algorithms are then used in a Modified Incomplete Cholesky Preconditioned Conjugate Gradient method to solve the sparse, symmetric, positive definite linear systems of equations arising from the discretization of three-dimensional finite difference groundwater flow models. By utilizing multiple threads, the proposed method achieves speedups of up to 60 times on a GeForce GTX 280 compared with a CPU implementation, and up to 4.8 times compared with the cuSPARSE library function optimized for GPUs by NVIDIA.
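
As a point of reference for the two triangular solves named in the abstract, the following is a minimal sequential sketch of applying a Cholesky-type preconditioner M = L L^T to a residual r: forward substitution solves L y = r, then back substitution solves L^T z = y. It is not the GPU algorithm of the paper. It assumes a small dense lower-triangular factor stored as a list of lists, and the function names (forward_substitution, back_substitution, apply_cholesky_preconditioner) are hypothetical, chosen only for this illustration.

```python
def forward_substitution(L, b):
    """Solve L y = b, where L is lower triangular with a nonzero diagonal."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        # y[i] depends on every previously computed y[j], j < i.
        s = sum(L[i][j] * y[j] for j in range(i))
        y[i] = (b[i] - s) / L[i][i]
    return y


def back_substitution(U, y):
    """Solve U x = y, where U is upper triangular with a nonzero diagonal."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        # x[i] depends on every later unknown x[j], j > i.
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x


def apply_cholesky_preconditioner(L, r):
    """Return z such that (L L^T) z = r, using two triangular solves."""
    n = len(r)
    # Explicit transpose of L (dense storage is an assumption of this sketch).
    Lt = [[L[j][i] for j in range(n)] for i in range(n)]
    y = forward_substitution(L, r)
    return back_substitution(Lt, y)


if __name__ == "__main__":
    # Illustrative factor and right-hand side, not data from the paper.
    L = [[2.0, 0.0, 0.0],
         [1.0, 3.0, 0.0],
         [0.0, 1.0, 2.0]]
    r = [8.0, 31.0, 21.0]
    print(apply_cholesky_preconditioner(L, r))
```

Each unknown in either loop depends on unknowns computed earlier in the same sweep. That data dependency is what makes triangular solves harder to parallelize than a matrix-vector product.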