The BiConjugate gradient method on GPUs

Authors:
G. Ortega;E. M. Garzón;F. Vázquez;I. García
Affiliations:
Dpt. of Comput. Archit. and Electron., Univ. of Almería, Almería, Spain 04120;Dpt. of Comput. Archit. and Electron., Univ. of Almería, Almería, Spain 04120;Dpt. of Comput. Archit. and Electron., Univ. of Almería, Almería, Spain 04120;Dpt. of Comput. Archit., Univ. of Málaga, Málaga, Spain 29071
Venue:
The Journal of Supercomputing
Year:
2013

Abstract

In a wide variety of applications from different scientific and engineering fields, the solution of complex and/or nonsymmetric linear systems of equations is required. To solve this kind of linear system, the BiConjugate Gradient method (BCG) is especially relevant. Nevertheless, BCG has an enormous computational cost. GPU computing is useful for accelerating this kind of algorithm, but suitable implementations must be developed to exploit the GPU architecture optimally. In this paper, we show how BCG can be effectively accelerated when all operations are computed on a GPU. To this end, BCG has been implemented with two alternative routines for the Sparse Matrix Vector product (SpMV): the CUSPARSE library and the ELLR-T routine. Although our interest is focused on complex matrices, our implementation has been evaluated on a GPU for two sets of test matrices, complex and real, in single and double precision. Experimental results show that BCG based on the ELLR-T routine achieves the best performance, particularly for the set of complex test matrices. Consequently, this method can be useful as a tool to efficiently solve the large linear systems of equations (complex and/or nonsymmetric) involved in a broad range of applications.
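For readers unfamiliar with the method, the following is a minimal CPU sketch of the textbook unpreconditioned BiConjugate Gradient iteration in NumPy. It is not the authors' GPU implementation: the function name `bicg`, its parameters, the stopping rule, and the dense-matrix products are all assumptions made for illustration. The two matrix-vector products per iteration (one with the matrix, one with its conjugate transpose) mark where a sparse SpMV routine would be used in practice.

```python
import numpy as np


def bicg(A, b, tol=1e-10, max_iter=1000):
    """Illustrative unpreconditioned BiCG for A x = b.

    A may be real or complex and need not be symmetric.
    Returns the approximate solution and the number of iterations used.
    """
    A = np.asarray(A)
    b = np.asarray(b)
    dtype = np.result_type(A.dtype, b.dtype, np.float64)

    AH = A.conj().T                      # conjugate transpose, used for the shadow system
    x = np.zeros(b.shape, dtype=dtype)   # initial guess x0 = 0
    r = b.astype(dtype)                  # residual r0 = b - A x0 = b
    rt = r.copy()                        # shadow residual (a common choice: rt0 = r0)
    p, pt = r.copy(), rt.copy()          # search directions
    b_norm = np.linalg.norm(b) or 1.0

    if np.linalg.norm(r) <= tol * b_norm:
        return x, 0

    rho = np.vdot(rt, r)                 # np.vdot conjugates its first argument
    for k in range(1, max_iter + 1):
        q = A @ p                        # first matrix-vector product
        qt = AH @ pt                     # second product, with the conjugate transpose
        denom = np.vdot(pt, q)
        if rho == 0 or denom == 0:
            raise ArithmeticError("BiCG breakdown")
        alpha = rho / denom

        x += alpha * p
        r -= alpha * q
        rt -= np.conj(alpha) * qt

        if np.linalg.norm(r) <= tol * b_norm:
            return x, k

        rho_new = np.vdot(rt, r)
        beta = rho_new / rho
        p = r + beta * p
        pt = rt + np.conj(beta) * pt
        rho = rho_new

    return x, max_iter
```

Apart from the two matrix-vector products, each iteration consists only of dot products and vector updates, so the matrix-vector products typically dominate the cost for large sparse systems. That is consistent with the abstract's focus on the choice of SpMV routine.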