Improving the memory-system performance of sparse-matrix vector multiplication
IBM Journal of Research and Development
Improving performance of sparse matrix-vector multiplication
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Introduction to Algorithms
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
Optimizing the performance of sparse matrix-vector multiplication
Optimizing the performance of sparse matrix-vector multiplication
Automatic performance tuning of sparse matrix kernels
Automatic performance tuning of sparse matrix kernels
Block Wiedemann algorithm on multicore architectures
ACM Communications in Computer Algebra
Hi-index | 0.00 |
Sparse matrix-vector multiplication or SpMXV is an important kernel in scientific computing. For example, the conjugate gradient method (CG) is an iterative linear system solving process where multiplication of the coefficient matrix A with a dense vector x is the main computational step accounting for as much as 90% of the overall running time. Though the total number of arithmetic operations (involving nonzero entries only) to compute Ax is fixed, reducing the probability of cache misses per operation is still a challenging area of research. This preprocessing is done once and its cost is amortized by repeated multiplications. Computers that employ cache memory to improve the speed of data access rely on reuse of data that are brought into the cache memory. The challenge is to exploit data locality especially for unstructured problems: modeling data locality in this context is hard.