Design Tradeoffs for BLAS Operations on Reconfigurable Hardware

  • Authors:
  • Affiliations:
  • Venue: ICPP '05 Proceedings of the 2005 International Conference on Parallel Processing
  • Year: 2005

Abstract

Numerical linear algebra operations are key primitives in scientific computing. Performance optimizations of such operations have been extensively investigated, and some basic operations have been implemented as software libraries. With the rapid advances in technology, hardware acceleration of linear algebra applications using FPGAs (Field-Programmable Gate Arrays) has become feasible. In this paper, we propose FPGA-based designs for several BLAS operations, including vector product, matrix-vector multiply, and matrix multiply. By identifying the design parameters for each BLAS operation, we analyze the design tradeoffs. In implementing the designs, we set the values of the design parameters according to hardware constraints such as the available area, the size of on-chip memory, the external memory bandwidth, and the number of I/O pins. The proposed designs are implemented on a Xilinx Virtex-II Pro FPGA.
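
For readers unfamiliar with the three operations named in the abstract, the following is a minimal software sketch of their semantics in plain Python. It is not the paper's FPGA design. It assumes that "vector product" means the dot (inner) product, and the function names `dot`, `matvec`, and `matmul` are illustrative only.

```python
def dot(x, y):
    """Level 1 BLAS style: inner product of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    """Level 2 BLAS style: y = A x, computed as one dot product per row of A."""
    return [dot(row, x) for row in A]


def matmul(A, B):
    """Level 3 BLAS style: C = A B, one dot product per (row of A, column of B)."""
    cols = list(zip(*B))  # transpose B so each entry is a column
    return [[dot(row, col) for col in cols] for row in A]
```

For vectors of length n and n-by-n matrices, the dot product performs about 2n arithmetic operations on 2n input words, the matrix-vector multiply about 2n² operations on roughly n² words, and the matrix multiply about 2n³ operations on roughly 3n² words (inputs plus output). This difference in data reuse is one reason constraints such as external memory bandwidth and on-chip memory size can weigh differently on each operation.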