Two-dimensional cache-oblivious sparse matrix-vector multiplication

Authors:
A. N. Yzelman; Rob H. Bisseling
Affiliations:
Mathematical Institute, Utrecht University, P.O. Box 80010, 3508 TA Utrecht, The Netherlands (both authors)
Venue:
Parallel Computing
Year:
2011

Citing 12

Improving the memory-system performance of sparse-matrix vector multiplication
IBM Journal of Research and Development

Hypergraph-Partitioning-Based Decomposition for Parallel Sparse-Matrix Vector Multiplication
IEEE Transactions on Parallel and Distributed Systems

Optimizing Sparse Matrix Computations for Register Reuse in SPARSITY
ICCS '01 Proceedings of the International Conference on Computational Sciences-Part I

A Fine-Grain Hypergraph Model for 2D Decomposition of Sparse Matrices
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium

Cache-Oblivious Algorithms
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science

A Two-Dimensional Data Distribution Method for Parallel Sparse Matrix-Vector Multiplication
SIAM Review

When cache blocking of sparse matrix vector multiply works and why
Applicable Algebra in Engineering, Communication and Computing

Analyzing block locality in Morton-order and Morton-hybrid matrices
ACM SIGARCH Computer Architecture News

A Hilbert-order multiplication scheme for unstructured sparse matrices
International Journal of Parallel, Emergent and Distributed Systems

Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks
Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures

Cache-Oblivious Sparse Matrix-Vector Multiplication by Using Sparse Matrix Partitioning Methods
SIAM Journal on Scientific Computing

The University of Florida sparse matrix collection
ACM Transactions on Mathematical Software (TOMS)

Cited 1

Fast Recommendation on Bibliographic Networks
ASONAM '12 Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)

Abstract

In earlier work, we presented a one-dimensional cache-oblivious sparse matrix-vector (SpMV) multiplication scheme rooted in one-dimensional sparse matrix partitioning. Partitioning is often used in distributed-memory parallel computing for the SpMV multiplication, an important kernel in many applications. A logical extension is to use two-dimensional partitioning. In this paper, we extend the one-dimensional method for cache-oblivious SpMV multiplication to two dimensions, while still allowing only row and column permutations on the sparse input matrix. This extension requires a generalisation of the compressed row storage data structure to a block-based data structure, for which several variants are investigated. Experiments performed on three different architectures show further improvements of the two-dimensional method over the one-dimensional method, especially in those cases where the one-dimensional method already provided significant gains. The largest gain obtained by our new reordering is over a factor of 3 in SpMV speed, compared to the natural matrix ordering.
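
To make the terms in the abstract concrete, the following is a minimal Python sketch of SpMV multiplication on a matrix held in compressed row storage (CRS), together with a check that permuting rows and columns leaves the product unchanged once the vectors are permuted accordingly. It is not the reordering or the block-based data structure of the paper: the function names (`crs_from_dense`, `spmv_crs`, `permute`), the 4x4 example matrix, and the two permutations are all assumptions chosen purely for illustration.

```python
def crs_from_dense(rows):
    """Build CRS arrays (values, col_idx, row_ptr) from a list of dense rows."""
    values, col_idx, row_ptr = [], [], [0]
    for row in rows:
        for j, a in enumerate(row):
            if a != 0:
                values.append(a)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends in values/col_idx
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def spmv_crs(values, col_idx, row_ptr, x):
    """Compute y = A x for a matrix A stored in CRS."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def permute(rows, row_perm, col_perm):
    """Return B with B[i][j] = A[row_perm[i]][col_perm[j]]."""
    return [[rows[r][c] for c in col_perm] for r in row_perm]


# Hypothetical example data, not taken from the paper.
A = [[4, 0, 0, 1],
     [0, 3, 0, 0],
     [2, 0, 5, 0],
     [0, 0, 0, 6]]
x = [1.0, 2.0, 3.0, 4.0]
row_perm = [2, 0, 3, 1]  # arbitrary illustrative permutations
col_perm = [3, 0, 2, 1]

# Product in the natural ordering.
y = spmv_crs(*crs_from_dense(A), x)

# Product with the reordered matrix and correspondingly permuted input vector.
B = permute(A, row_perm, col_perm)
x_perm = [x[c] for c in col_perm]
y_perm = spmv_crs(*crs_from_dense(B), x_perm)

# Undo the row permutation: y_perm[i] corresponds to y[row_perm[i]].
y_back = [0.0] * len(y)
for i, r in enumerate(row_perm):
    y_back[r] = y_perm[i]
```

Here `y_back` equals `y`, which is why a reordering restricted to row and column permutations changes only the memory access pattern of the multiplication, not its result. The permutations above are arbitrary; in the paper they come from sparse matrix partitioning, which this sketch does not attempt to reproduce.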