Vectorized sparse matrix multiply for compressed row storage format

Authors:
Eduardo F. D'Azevedo; Mark R. Fahey; Richard T. Mills
Affiliations:
Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN; Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, TN; Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, TN
Venue:
ICCS'05 Proceedings of the 5th international conference on Computational Science - Volume Part I
Year:
2005

Abstract

The innovation of this work is a simple vectorizable algorithm for performing sparse matrix-vector multiplication in compressed sparse row (CSR) storage format. Unlike the vectorizable jagged diagonal format (JAD), this algorithm requires no data rearrangement and can be easily adapted to a sophisticated library framework such as PETSc. Numerical experiments on the Cray X1 show an order-of-magnitude improvement over the non-vectorized algorithm.
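
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the setting, not the authors' implementation. It shows a plain CSR matrix-vector product, whose short inner loop over one row is hard to vectorize, next to one possible way to obtain a long, dependence-free loop without touching the stored matrix data: rows with the same number of nonzeros are grouped through an auxiliary index list, and the innermost loop then runs across the rows of a group. The grouping strategy, the function names (`csr_matvec`, `group_rows_by_length`, `csr_matvec_grouped`), and the 4x4 example matrix are all assumptions made for illustration.

```python
def csr_matvec(row_ptr, col_idx, values, x):
    """Reference CSR product y = A @ x, computed one row at a time."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        # The inner loop is short and of variable length, so it vectorizes poorly.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def group_rows_by_length(row_ptr):
    """Map each nonzero count to the list of rows that have it.

    Only this auxiliary index structure is built; the CSR arrays are untouched.
    """
    groups = {}
    for i in range(len(row_ptr) - 1):
        groups.setdefault(row_ptr[i + 1] - row_ptr[i], []).append(i)
    return groups


def csr_matvec_grouped(row_ptr, col_idx, values, x):
    """Same product, with the innermost loop running across rows of equal length."""
    y = [0.0] * (len(row_ptr) - 1)
    for nnz, rows in group_rows_by_length(row_ptr).items():
        for j in range(nnz):
            # Every row in the group has a j-th nonzero and writes to a distinct
            # y[i], so iterations of this loop are independent of each other.
            # On a vector machine this is the loop one would hope to vectorize.
            for i in rows:
                k = row_ptr[i] + j
                y[i] += values[k] * x[col_idx[k]]
    return y


# Hypothetical example matrix (assumption, not from the paper):
# [[10, 0, 0, 2],
#  [ 0, 3, 0, 0],
#  [ 0, 0, 5, 6],
#  [ 7, 0, 0, 0]]
row_ptr = [0, 2, 3, 5, 6]
col_idx = [0, 3, 1, 2, 3, 0]
values = [10.0, 2.0, 3.0, 5.0, 6.0, 7.0]
x = [1.0, 2.0, 3.0, 4.0]

y_reference = csr_matvec(row_ptr, col_idx, values, x)
y_grouped = csr_matvec_grouped(row_ptr, col_idx, values, x)
```

Both functions read the same `row_ptr`, `col_idx`, and `values` arrays, which is the property the abstract contrasts with JAD, where the matrix entries themselves are reordered. Pure Python will not show any speedup; the sketch only makes the loop structure visible.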