Improving the Performance of the Sparse Matrix Vector Product with GPUs

Authors:
F. Vazquez; G. Ortega; J. J. Fernandez; E. M. Garzon
Venue:
CIT '10 Proceedings of the 2010 10th IEEE International Conference on Computer and Information Technology
Year:
2010

Abstract

Sparse matrices arise in linear systems, eigensystems and partial differential equations across a wide spectrum of scientific and engineering disciplines. Hence, the sparse matrix vector product (SpMV) is considered a key operation in engineering and scientific computing, and its optimization is highly relevant for these applications. However, the irregular computation involved in SpMV prevents computational architectures from being exploited optimally when the sparse matrices are very large. Graphics Processing Units (GPUs) have recently emerged as platforms that yield outstanding acceleration factors, and SpMV implementations for GPUs have already appeared. This work proposes and evaluates new implementations of SpMV for GPUs called ELLR-T. They are based on the ELLPACK-R format, which stores the sparse matrix in a regular manner. They are compared against a variety of previously proposed storage formats on a representative set of test matrices. The results show that: (1) SpMV is highly accelerated on GPUs; (2) performance strongly depends on the specific pattern of the matrix; and (3) the ELLR-T implementations achieve higher overall performance. Consequently, the ELLR-T implementations of SpMV described in this paper can help to exploit GPUs, because they achieve high performance and can be easily integrated into engineering and scientific computing applications.
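
The abstract names the ELLPACK-R format but does not spell out its layout. As a rough illustration only, the sketch below assumes the commonly described form of ELLPACK-R: the nonzeros of each row are padded to a common width and stored column-major, together with an extra array of row lengths (called rl here) so that the product loop can skip the padding. The function names to_ellpack_r and spmv_ellpack_r are hypothetical, and the code is a sequential CPU sketch in plain Python rather than the authors' GPU kernels.

```python
def to_ellpack_r(dense):
    """Convert a dense matrix (list of rows) to an ELLPACK-R-style layout.

    Returns (vals, cols, rl, n_rows). This layout is an assumption based on
    the usual description of ELLPACK-R, not taken from the abstract.
    """
    n_rows = len(dense)
    # Collect (column, value) pairs of the nonzeros in each row.
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    rl = [len(r) for r in rows]          # rl[i] = number of nonzeros in row i
    width = max(rl, default=0)           # padded row width
    # Column-major padded storage: the k-th nonzero of row i is at k * n_rows + i.
    vals = [0.0] * (width * n_rows)
    cols = [0] * (width * n_rows)
    for i, r in enumerate(rows):
        for k, (j, v) in enumerate(r):
            vals[k * n_rows + i] = float(v)
            cols[k * n_rows + i] = j
    return vals, cols, rl, n_rows


def spmv_ellpack_r(vals, cols, rl, n_rows, x):
    """Compute y = A x for a matrix held in the layout built above."""
    y = [0.0] * n_rows
    for i in range(n_rows):              # a GPU version would assign rows to threads
        acc = 0.0
        for k in range(rl[i]):           # visit only the real nonzeros of row i
            idx = k * n_rows + i
            acc += vals[idx] * x[cols[idx]]
        y[i] = acc
    return y


A = [[1, 0, 2],
     [0, 0, 3],
     [4, 5, 6]]
vals, cols, rl, n_rows = to_ellpack_r(A)
y = spmv_ellpack_r(vals, cols, rl, n_rows, [1.0, 2.0, 3.0])
print(rl, y)
```

In a GPU setting the outer loop over rows is the part that would be distributed across threads, and the column-major layout is what would let neighbouring threads read neighbouring memory locations. The ELLR-T variants are generally described as assigning T threads to each row instead of one, which this sequential sketch does not attempt to model.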