Streaming sparse matrix compression/decompression

Authors:
David Moloney;Dermot Geraghty;Colm McSweeney;Ciaran McElroy
Affiliations:
Department Of Mechanical & Manufacturing Engineering, Trinity College Dublin, Dublin 2, Ireland;Department Of Mechanical & Manufacturing Engineering, Trinity College Dublin, Dublin 2, Ireland;Department Of Mechanical & Manufacturing Engineering, Trinity College Dublin, Dublin 2, Ireland;Department Of Mechanical & Manufacturing Engineering, Trinity College Dublin, Dublin 2, Ireland
Venue:
HiPEAC'05 Proceedings of the First international conference on High Performance Embedded Architectures and Compilers
Year:
2005

Abstract

A streaming floating-point sparse-matrix compression scheme is described that forms a key element of an accelerator for finite-element and other linear algebra applications. The proposed architecture seeks to accelerate Sparse Matrix-Vector Multiplication (SMVM), the key performance-limiting operation at the heart of finite-element applications, by combining a dedicated datapath optimized for these applications with a streaming data-compression and decompression unit that increases the effective memory bandwidth seen by the datapath. The proposed format uses variable-length entries, each containing an opcode and, optionally, an address and/or a non-zero entry. System simulations performed using a cycle-accurate C++ architectural model and a database of over 400 large symmetric and unsymmetric matrices containing up to 20M non-zero elements (226M non-zeroes in total) demonstrate that the proposed architecture achieves a 20% average improvement in effective memory bandwidth compared with published work, for a modest increase in hardware resources.
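
The abstract does not give the actual opcode encoding, so the following is only an illustrative sketch of the general idea of variable-length entries made of an opcode plus an optional address and an optional non-zero value. The flag names `HAS_ADDR` and `HAS_VALUE`, the field widths, and the rules for when a field may be omitted (column index follows the previous one; value repeats the previous one) are all assumptions made for this example, not the format proposed in the paper.

```python
import struct

# Hypothetical opcode bits; the real encoding is not given in the abstract.
HAS_ADDR = 0x1   # a 4-byte little-endian column index follows the opcode
HAS_VALUE = 0x2  # an 8-byte float64 value follows (after any address)


def compress(entries):
    """Encode (column, value) pairs as a variable-length byte stream.

    Assumed rules: the address is omitted when the column is the previous
    column plus one, and the value is omitted when it is bit-identical to
    the previous value.
    """
    out = bytearray()
    prev_col, prev_bits = -1, None
    for col, val in entries:
        bits = struct.pack("<d", val)  # compare raw bits, so -0.0 != 0.0
        opcode = 0
        if col != prev_col + 1:
            opcode |= HAS_ADDR
        if bits != prev_bits:
            opcode |= HAS_VALUE
        out.append(opcode)
        if opcode & HAS_ADDR:
            out += struct.pack("<I", col)
        if opcode & HAS_VALUE:
            out += bits
        prev_col, prev_bits = col, bits
    return bytes(out)


def decompress(stream):
    """Decode the byte stream produced by compress() back into pairs."""
    entries = []
    pos = 0
    prev_col, prev_val = -1, None
    while pos < len(stream):
        opcode = stream[pos]
        pos += 1
        if opcode & HAS_ADDR:
            (col,) = struct.unpack_from("<I", stream, pos)
            pos += 4
        else:
            col = prev_col + 1  # implicit address: next consecutive column
        if opcode & HAS_VALUE:
            (val,) = struct.unpack_from("<d", stream, pos)
            pos += 8
        else:
            val = prev_val      # implicit value: repeat of the previous one
        entries.append((col, val))
        prev_col, prev_val = col, val
    return entries
```

Under these assumed rules, a run of consecutive columns sharing one value costs a single opcode byte per entry, which is the kind of saving that would raise the effective memory bandwidth seen by a streaming datapath.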