The sparse matrix-vector (SpMV) multiplication routine is an important building block in many iterative algorithms for solving scientific and engineering problems. One of the main challenges of SpMV is that it is memory-bound. Although compression has previously been proposed to improve SpMV performance on CPUs, its use has not been demonstrated on GPUs because many compression and decompression schemes are inherently serial. In this paper, we introduce a family of bit-representation-optimized (BRO) compression schemes for representing sparse matrices on GPUs. The proposed schemes, BRO-ELL, BRO-COO, and BRO-HYB, compress the index data and speed up SpMV on GPUs by reducing memory traffic. Furthermore, we formulate a BRO-aware matrix reordering scheme as a data clustering problem and use it to increase compression ratios. Experiments show that the proposed schemes achieve an average speedup of 1.5x over ELLPACK and HYB for SpMV on GPUs.
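To make the idea of index compression concrete, the following is a minimal sketch in Python. It is not the BRO-ELL, BRO-COO, or BRO-HYB format from the paper, whose details are not given in the abstract. It assumes a simplified scheme: each row's sorted column indices are delta-encoded, and the deltas are packed with a single fixed bit width chosen from the largest delta in the matrix. All function names (`compress`, `pack_row`, `unpack_row`, `spmv`) and the example data are hypothetical.

```python
def bits_needed(value):
    """Bits required to store a non-negative integer (at least 1)."""
    return max(1, value.bit_length())


def pack_row(cols, width):
    """Delta-encode sorted column indices and pack the deltas into one integer."""
    packed, prev = 0, 0
    for k, c in enumerate(cols):
        packed |= (c - prev) << (k * width)  # delta k occupies bits [k*width, (k+1)*width)
        prev = c
    return packed


def unpack_row(packed, count, width):
    """Recover the column indices by extracting deltas and prefix-summing them."""
    mask = (1 << width) - 1
    cols, prev = [], 0
    for k in range(count):
        prev += (packed >> (k * width)) & mask
        cols.append(prev)
    return cols


def compress(col_rows):
    """Choose one bit width for the whole matrix and pack every row with it."""
    max_delta = 0
    for cols in col_rows:
        prev = 0
        for c in cols:
            max_delta = max(max_delta, c - prev)
            prev = c
    width = bits_needed(max_delta)
    return width, [(len(cols), pack_row(cols, width)) for cols in col_rows]


def spmv(width, packed_rows, val_rows, x):
    """Compute y = A x, decompressing each row's column indices on the fly."""
    y = []
    for (count, packed), vals in zip(packed_rows, val_rows):
        cols = unpack_row(packed, count, width)
        y.append(sum(v * x[c] for v, c in zip(vals, cols)))
    return y


# Hypothetical 4 x 8 matrix: per-row column indices and matching values.
col_rows = [[0, 1, 2], [1, 2, 3], [5, 6, 7], [0, 4]]
val_rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [10.0, 11.0]]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

width, packed_rows = compress(col_rows)
y = spmv(width, packed_rows, val_rows, x)

# Index storage in bits: 32-bit indices versus packed deltas.
nnz = sum(len(cols) for cols in col_rows)
plain_bits, packed_bits = 32 * nnz, width * nnz
```

In this sketch the index data shrinks because each delta needs far fewer bits than a full 32-bit column index, which is the kind of memory-traffic reduction the abstract refers to. Each row is decoded independently of the others, so decompression is serial only within a row. A GPU kernel would lay out the packed bits differently, and that layout is beyond what the abstract specifies.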