Research on the conjugate gradient algorithm with a modified incomplete Cholesky preconditioner on GPU

  • Authors: Jiaquan Gao, Ronghua Liang, Jun Wang

  • Venue: Journal of Parallel and Distributed Computing
  • Year: 2014

Abstract

In this study, we identify the parallelism in the forward/backward substitutions (FBS) for two cases and, on that basis, propose an efficient preconditioned conjugate gradient algorithm with the modified incomplete Cholesky preconditioner on the GPU (GPUMICPCGA). GPUMICPCGA has three distinguishing characteristics: (1) vector operations are optimized by grouping several of them into single kernels, (2) a new inner-product kernel and a new, highly optimized kernel for sparse matrix-vector multiplication are presented, and (3) an efficient parallel implementation of FBS on the GPU (GPUFBS) is suggested for the two cases. Numerical results show that the proposed kernels outperform the corresponding ones in CUBLAS or CUSPARSE, and that GPUFBS is almost 3 times faster than the FBS implementation that uses the CUSPARSE library. Furthermore, GPUMICPCGA behaves better than its counterpart implemented with the CUBLAS and CUSPARSE libraries.
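
For orientation, the overall structure named in the abstract (conjugate gradient iterations, with the preconditioner applied by one forward and one backward substitution per iteration) can be sketched serially on the CPU. This is a minimal illustration under stated assumptions, not the authors' GPU implementation: it uses dense Python lists, a small 2D Laplacian test matrix chosen here for illustration, and plain zero-fill incomplete Cholesky in place of the modified variant. All function names (`laplacian_2d`, `ic0`, `forward_sub`, `backward_sub`, `pcg`) are hypothetical.

```python
import math

def laplacian_2d(m):
    """Dense 5-point Laplacian on an m x m grid (symmetric positive definite)."""
    n = m * m
    A = [[0.0] * n for _ in range(n)]
    for i in range(m):
        for j in range(m):
            k = i * m + j
            A[k][k] = 4.0
            if j > 0:
                A[k][k - 1] = -1.0
            if j < m - 1:
                A[k][k + 1] = -1.0
            if i > 0:
                A[k][k - m] = -1.0
            if i < m - 1:
                A[k][k + m] = -1.0
    return A

def ic0(A):
    """Zero-fill incomplete Cholesky (assumption: stands in for the modified
    variant). L keeps the sparsity pattern of the lower triangle of A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            if A[i][j] == 0.0:
                continue  # drop fill-in outside the pattern of A
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def forward_sub(L, b):
    """Solve L y = b for lower-triangular L."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    return y

def backward_sub(L, y):
    """Solve L^T z = y, reading L^T[i][k] as L[k][i]."""
    n = len(y)
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (y[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(A, b, L, tol=1e-10, max_iter=200):
    """Preconditioned CG with M = L L^T; assumes b is nonzero."""
    x = [0.0] * len(b)
    r = b[:]
    z = backward_sub(L, forward_sub(L, r))  # z = M^{-1} r via FBS
    p = z[:]
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Ap)]
        if math.sqrt(dot(r, r)) < tol:
            return x, it
        z = backward_sub(L, forward_sub(L, r))
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

if __name__ == "__main__":
    A = laplacian_2d(4)
    b = matvec(A, [1.0] * 16)  # right-hand side with known solution of all ones
    x, iters = pcg(A, b, ic0(A))
    print(iters, max(abs(xi - 1.0) for xi in x))
```

In this serial form, each unknown in `forward_sub` and `backward_sub` depends on previously computed ones, which is why the substitutions are usually the hardest part of the iteration to parallelize. The vector updates, the inner products in `dot`, and the product in `matvec` correspond to the operations the abstract describes as implemented in GPU kernels.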