CSX: an extended compression format for SpMV on shared memory systems

Authors:
Kornilios Kourtis;Vasileios Karakasis;Georgios Goumas;Nectarios Koziris
Affiliations:
National Technical University of Athens, Zografou, Greece (all four authors)
Venue:
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Year:
2011

Abstract

The Sparse Matrix-Vector multiplication (SpMV) kernel scales poorly on shared memory systems with multiple processing units due to the streaming nature of its data access pattern. Previous research has demonstrated that an effective strategy to improve the kernel's performance is to drastically reduce the data volume involved in the computations. Since the storage formats for sparse matrices include metadata describing the structure of non-zero elements within the matrix, we propose a generalized approach to compress metadata by exploiting substructures within the matrix. We call the proposed storage format Compressed Sparse eXtended (CSX). In our implementation we employ runtime code generation to construct specialized SpMV routines for each matrix. Experimental evaluation on two shared memory systems for 15 sparse matrices demonstrates significant performance gains as the number of participating cores increases. Regarding the cost of CSX construction, we propose several strategies that trade performance for preprocessing cost, making CSX applicable to both online and offline preprocessing.
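
To make the idea of metadata compression concrete, the following is a minimal sketch and not the CSX encoding itself, which the abstract does not specify. It assumes the common CSR layout (`row_ptr`, `col_ind`, `values`) as the baseline and shows how the column indices of one row could be collapsed into `(start, delta, length)` units when they form regular runs. The function names, the unit layout, and the `min_len` threshold are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def spmv_csr(row_ptr: List[int], col_ind: List[int],
             values: List[float], x: List[float]) -> List[float]:
    """Baseline y = A @ x for a matrix stored in CSR.

    Every non-zero costs one value plus one column index of memory
    traffic; the column indices are the metadata worth compressing.
    """
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_ind[k]]
        y[i] = acc
    return y


def encode_row_runs(cols: List[int],
                    min_len: int = 3) -> List[Tuple[int, int, int]]:
    """Greedily encode sorted column indices as (start, delta, length).

    Hypothetical illustration: a run of at least `min_len` indices with
    a constant stride becomes one unit; anything else is emitted as a
    single-element unit (delta 0, length 1).
    """
    units = []
    i, n = 0, len(cols)
    while i < n:
        j = i + 1
        if j < n:
            delta = cols[j] - cols[i]
            # Extend the run while the stride stays constant.
            while j + 1 < n and cols[j + 1] - cols[j] == delta:
                j += 1
            length = j - i + 1
            if length >= min_len:
                units.append((cols[i], delta, length))
                i = j + 1
                continue
        units.append((cols[i], 0, 1))
        i += 1
    return units


def decode_row_runs(units: List[Tuple[int, int, int]]) -> List[int]:
    """Expand (start, delta, length) units back into column indices."""
    cols = []
    for start, delta, length in units:
        cols.extend(start + delta * t for t in range(length))
    return cols
```

Under these assumptions, a row with column indices `[0, 1, 2, 3, 7, 10, 12, 14, 16]` needs three units instead of nine indices. A kernel specialized per matrix, as the abstract describes, could then iterate over such units directly rather than decoding them first.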