Optimizing Sparse Data Structures for Matrix-vector Multiply

Authors:
D. Guo; W. Gropp
Affiliations:
National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, IL, USA; National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, IL, USA
Venue:
International Journal of High Performance Computing Applications
Year:
2011

Abstract

Sparse matrix-vector multiply is an important operation in a wide range of problems. One of the key factors determining the performance of this operation is sustained memory bandwidth. In the IBM POWER architecture, there is a hardware component called a prefetch data stream that can significantly increase sustained memory bandwidth. We have developed a new family of storage formats for sparse matrices that exploits this capability. Test results show that our new streamed storage formats can significantly improve the performance of sparse matrix and vector multiply on IBM POWER processors, compared to traditional compressed sparse row and block compressed sparse row formats. The new formats also provide a benefit on x86 processors.
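
For orientation, the traditional compressed sparse row (CSR) baseline that the abstract compares against can be sketched as below. This is a minimal illustration of standard CSR only, not the authors' streamed storage format, whose layout the abstract does not describe. The function name `csr_matvec`, the array names `row_ptr`, `col_idx`, and `values`, and the small example matrix are all assumptions chosen for this sketch.

```python
def csr_matvec(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in compressed sparse row form.

    Hypothetical helper for illustration; the names are not from the paper.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # The nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # values and col_idx are read sequentially; x is read indirectly.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Illustrative 3x3 matrix (made-up values):
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]
y = csr_matvec(row_ptr, col_idx, values, x)
```

In this loop each nonzero contributes one multiply-add but requires loading a matrix value, a column index, and an entry of `x`, which is one common way to see why sustained memory bandwidth, rather than arithmetic throughput, tends to limit this operation.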