Optimizing Sparse Matrix Computations for Register Reuse in SPARSITY

  • Authors:
  • Eun-Jin Im; Katherine A. Yelick

  • Venue:
  • ICCS '01: Proceedings of the International Conference on Computational Science - Part I
  • Year:
  • 2001

Abstract

Sparse matrix-vector multiplication is an important computational kernel that tends to perform poorly on modern processors, largely because of its high ratio of memory operations to arithmetic operations. Optimizing this algorithm is difficult, both because of the complexity of memory systems and because the performance is highly dependent on the nonzero structure of the matrix. The Sparsity system is designed to address these problems by allowing users to automatically build sparse matrix kernels that are tuned to their matrices and machines. The most difficult aspect of optimizing these algorithms is selecting among a large set of possible transformations and choosing parameters, such as block size. In this paper we discuss the optimization of two operations: a sparse matrix times a dense vector and a sparse matrix times a set of dense vectors. Our experience indicates that for matrices arising in scientific simulations, register level optimizations are critical, and we focus here on the optimizations and parameter selection techniques used in Sparsity for register-level optimizations. We demonstrate speedups of up to 2× for the single vector case and 5× for the multiple vector case.
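
To make the register-reuse idea concrete, the sketch below shows a sparse matrix-vector multiply over a 2×2 block compressed sparse row (BCSR) layout. This is not the code that Sparsity generates; it is a minimal, hand-written illustration of the general register-blocking technique the abstract refers to, with an assumed fixed 2×2 block size and illustrative variable names. The point is that each block row keeps two destination values, and each block keeps two source values, in registers across the block's four multiply-adds, reducing loads per flop relative to an unblocked CSR loop.

```c
#include <stdio.h>

/* y += A*x for a matrix stored in 2x2 block compressed sparse row (BCSR).
 * brows    : number of block rows (matrix has 2*brows rows)
 * brow_ptr : start index of each block row in bcol/values (length brows+1)
 * bcol     : block-column index of each stored 2x2 block
 * values   : the 4 entries of each block, row-major (a00 a01 a10 a11) */
static void spmv_bcsr_2x2(int brows, const int *brow_ptr, const int *bcol,
                          const double *values, const double *x, double *y)
{
    for (int bi = 0; bi < brows; bi++) {
        double y0 = y[2*bi];          /* destination values held in registers */
        double y1 = y[2*bi + 1];
        for (int k = brow_ptr[bi]; k < brow_ptr[bi + 1]; k++) {
            const double *a = &values[4*k];
            double x0 = x[2*bcol[k]];     /* each source value reused twice */
            double x1 = x[2*bcol[k] + 1];
            y0 += a[0]*x0 + a[1]*x1;
            y1 += a[2]*x0 + a[3]*x1;
        }
        y[2*bi]     = y0;
        y[2*bi + 1] = y1;
    }
}

int main(void)
{
    /* 4x4 matrix with two nonzero 2x2 blocks: block (0,0) and block (1,1). */
    int brow_ptr[] = {0, 1, 2};
    int bcol[]     = {0, 1};
    double values[] = {1, 2, 3, 4,   5, 6, 7, 8};
    double x[] = {1, 1, 1, 1};
    double y[] = {0, 0, 0, 0};

    spmv_bcsr_2x2(2, brow_ptr, bcol, values, x, y);
    for (int i = 0; i < 4; i++)
        printf("y[%d] = %g\n", i, y[i]);   /* expect 3 7 11 15 */
    return 0;
}
```

In practice the profitable block size depends on both the machine and the matrix's nonzero structure, which is exactly the parameter-selection problem the paper addresses; blocking can also introduce explicit zeros when nonzeros do not align with the chosen blocks, so the choice trades extra arithmetic against fewer memory operations.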