Sparse matrix-vector multiply on the HICAMP architecture

  • Authors and affiliations:
  • John P. Stevenson (Stanford University, Palo Alto, CA, USA)
  • Amin Firoozshahian (HICAMP Systems, Menlo Park, CA, USA)
  • Alex Solomatnikov (HICAMP Systems, Menlo Park, CA, USA)
  • Mark Horowitz (Stanford University, Palo Alto, CA, USA)
  • David Cheriton (Stanford University & HICAMP Systems, Palo Alto, CA, USA)

  • Venue:
  • Proceedings of the 26th ACM International Conference on Supercomputing
  • Year:
  • 2012


Abstract

Sparse matrix-vector multiply (SpMV) is a critical task in the inner loop of modern iterative linear system solvers, and it exhibits very little data reuse. This low reuse means that SpMV performance is bounded by main-memory bandwidth; moreover, its random patterns of indirection make it difficult even to reach that bound. We present sparse matrix storage formats based on deduplicated memory. These formats reduce memory traffic during SpMV and thus raise the performance bound significantly: by 90x in the best case. Additionally, we introduce a matrix format that inherently exploits any amount of matrix symmetry while remaining fully compatible with non-symmetric matrix code. Because of this, multiple threads can concurrently operate on a symmetric matrix without complicated work-partitioning schemes and without any thread synchronization or locking. This approach takes advantage of growing processor caches, but it incurs an instruction-count overhead. That overhead can be eliminated with specialized hardware, as shown by the recently proposed Hierarchical Immutable Content-Addressable Memory Processor (HICAMP) architecture.
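For context, the abstract's claims about low data reuse and random indirection refer to the standard SpMV kernel. A minimal sketch of a conventional CSR (compressed sparse row) SpMV is below; this is the baseline the paper improves upon, not the deduplicated HICAMP format itself:

```python
# Illustrative baseline: conventional CSR SpMV, not the paper's
# deduplicated format. The indirect load x[col[j]] is the random
# access pattern the abstract describes: val and col stream through
# memory exactly once (no reuse), so bandwidth bounds performance.

def spmv_csr(row_ptr, col, val, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        acc = 0.0
        for j in range(row_ptr[i], row_ptr[i + 1]):
            acc += val[j] * x[col[j]]  # gather through col: indirection
        y[i] = acc
    return y

# 3x3 example: A = [[2, 0, 1],
#                   [0, 3, 0],
#                   [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col = [0, 2, 1, 0, 2]
val = [2.0, 1.0, 3.0, 4.0, 5.0]
x = [1.0, 1.0, 1.0]
print(spmv_csr(row_ptr, col, val, x))  # -> [3.0, 3.0, 9.0]
```

Note that a conventional symmetric-storage variant (keeping only one triangle) would also have to scatter into y[col[j]], which is why ordinary symmetric SpMV needs per-thread partial results or locking; the format described in the abstract avoids that coordination entirely.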