Floating-point sparse matrix-vector multiply for FPGAs

  • Authors:
  • Michael deLorimier; André DeHon

  • Affiliations:
  • California Institute of Technology, Pasadena, CA; California Institute of Technology, Pasadena, CA

  • Venue:
  • Proceedings of the 2005 ACM/SIGDA 13th International Symposium on Field-Programmable Gate Arrays
  • Year:
  • 2005

Abstract

Large, high-density FPGAs with high local distributed memory bandwidth surpass the peak floating-point performance of high-end, general-purpose processors. Moreover, microprocessors deliver nowhere near their peak floating-point performance on efficient algorithms that use the Sparse Matrix-Vector Multiply (SMVM) kernel; in fact, it is not uncommon for them to yield only 10--20% of peak when computing SMVM. We develop and analyze a scalable SMVM implementation for modern FPGAs and show that it can sustain high-throughput, near-peak floating-point performance. For benchmark matrices from the Matrix Market suite, we project 1.5 double-precision Gflops for a single Virtex II 6000-4 FPGA and 12 double-precision Gflops for 16 Virtex IIs (750 Mflops/FPGA).
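
For reference, the SMVM kernel computes y = Ax, where A is sparse and only its nonzeros are stored. The sketch below uses the common compressed sparse row (CSR) layout; the paper does not tie its design to a particular storage format, so the CSR choice and the function name smvm_csr here are purely illustrative:

    #include <stddef.h>

    /* Illustrative SMVM kernel: y = A*x with A in compressed sparse row
     * (CSR) form. CSR is one common storage choice; the paper does not
     * mandate a particular format.
     *   row_ptr[i] .. row_ptr[i+1]-1  index the stored nonzeros of row i
     *   col_idx[k]                    is the column of the k-th nonzero
     *   val[k]                        is its double-precision value
     */
    void smvm_csr(size_t n_rows,
                  const size_t *row_ptr,
                  const size_t *col_idx,
                  const double *val,
                  const double *x,
                  double *y)
    {
        for (size_t i = 0; i < n_rows; ++i) {
            double sum = 0.0;
            for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
                sum += val[k] * x[col_idx[k]];  /* one multiply-add per nonzero */
            y[i] = sum;
        }
    }

The indirect load x[col_idx[k]] makes the memory access pattern irregular, which is widely understood to be why cached general-purpose processors fall into the 10--20%-of-peak range cited above; the FPGA design instead draws on local distributed memory bandwidth to keep its floating-point units fed.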