Implementing Blocked Sparse Matrix-Vector Multiplication on NVIDIA GPUs

  • Authors:
  • Alexander Monakov; Arutyun Avetisyan

  • Affiliations:
  • Institute for System Programming of RAS, Moscow, Russia; Institute for System Programming of RAS, Moscow, Russia

  • Venue:
  • SAMOS '09 Proceedings of the 9th International Workshop on Embedded Computer Systems: Architectures, Modeling, and Simulation
  • Year:
  • 2009

Abstract

We discuss implementing blocked sparse matrix-vector multiplication for NVIDIA GPUs. We outline an algorithm and various optimizations, and identify potential future improvements and challenging tasks. Compared with a previously published implementation, ours is faster on matrices with many high fill-ratio blocks but slower on matrices with a low number of non-zero elements per row.
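
To make the terms in the abstract concrete, here is a minimal CPU reference sketch of blocked sparse matrix-vector multiplication. It is not the authors' GPU kernel or storage format: it assumes a generic block compressed sparse row (BCSR) layout, and it takes "fill ratio" to mean the fraction of stored block entries that are non-zero. The function names `to_bcsr` and `bcsr_spmv` are hypothetical and chosen only for this illustration.

```python
def to_bcsr(dense, r, c):
    """Convert a dense matrix (list of lists) to BCSR with r x c blocks.

    Assumed layout (illustrative, not taken from the paper): every block that
    holds at least one non-zero is stored densely in row-major order, padding
    zeros included. Returns (row_ptr, col_idx, blocks, fill_ratio).
    """
    n_rows, n_cols = len(dense), len(dense[0])
    assert n_rows % r == 0 and n_cols % c == 0, "dimensions must be multiples of the block size"
    row_ptr, col_idx, blocks = [0], [], []
    nnz = 0
    for bi in range(n_rows // r):
        for bj in range(n_cols // c):
            block = [dense[bi * r + i][bj * c + j] for i in range(r) for j in range(c)]
            block_nnz = sum(1 for v in block if v != 0)
            if block_nnz:  # skip blocks that are entirely zero
                col_idx.append(bj)
                blocks.append(block)
                nnz += block_nnz
        row_ptr.append(len(col_idx))
    stored = len(blocks) * r * c
    # Fraction of stored entries that are real non-zeros (1.0 means no padding).
    fill_ratio = nnz / stored if stored else 0.0
    return row_ptr, col_idx, blocks, fill_ratio


def bcsr_spmv(row_ptr, col_idx, blocks, r, c, x):
    """Compute y = A @ x for a matrix A stored in the BCSR layout above."""
    y = [0.0] * ((len(row_ptr) - 1) * r)
    for bi in range(len(row_ptr) - 1):
        for k in range(row_ptr[bi], row_ptr[bi + 1]):
            x0 = col_idx[k] * c  # first column covered by this block
            block = blocks[k]
            for i in range(r):
                acc = 0.0
                for j in range(c):
                    acc += block[i * c + j] * x[x0 + j]
                y[bi * r + i] += acc
    return y


if __name__ == "__main__":
    A = [
        [1, 2, 0, 0],
        [3, 4, 0, 0],
        [0, 0, 5, 0],
        [0, 0, 0, 0],
    ]
    row_ptr, col_idx, blocks, fill = to_bcsr(A, 2, 2)
    print(fill)
    print(bcsr_spmv(row_ptr, col_idx, blocks, 2, 2, [1, 1, 1, 1]))
```

In this sketch, padding zeros are stored and multiplied like any other entry, so a low fill ratio means extra memory traffic and arithmetic for no useful result. This is one plausible reading of why blocked formats tend to pay off mainly on matrices with many well-filled blocks.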