Algorithm 784: GEMM-based level 3 BLAS: portability and optimization issues

  • Authors:
  • Bo Kågström; Charles van Loan

  • Affiliations:
  • Umeå Univ., Umeå, Sweden; Cornell Univ., Ithaca, NY

  • Venue:
  • ACM Transactions on Mathematical Software (TOMS)
  • Year:
  • 1998

Abstract

This companion article discusses portability and optimization issues of the GEMM-based level 3 BLAS model implementations and the performance evaluation benchmark. All software comes in all four data types (single- and double-precision, real and complex) and is designed to be easy to implement and use on different platforms. Each of the GEMM-based routines has a few machine-dependent parameters that specify internal block sizes, cache characteristics, and branch points for alternative code sections. These parameters provide a means of adjusting the routines to the characteristics of a memory hierarchy.
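
To make the idea of a GEMM-based routine with a tunable block size concrete, here is a minimal sketch in Python. It is not the published software. The names `BLOCK_ROWS`, `gemm`, `transpose`, and `syrk_lower` are hypothetical, and the choice of a symmetric rank-k update as the example operation is an assumption made for illustration. The sketch computes the lower triangle of C := alpha*A*A^T + beta*C by handing each off-diagonal block to a general matrix multiply and treating only the small diagonal blocks separately.

```python
# Hypothetical tuning parameter (an assumption, not a name from the article):
# number of rows of C updated per block. In a tuned library a value like this
# would be chosen to suit the memory hierarchy of the target machine.
BLOCK_ROWS = 2


def gemm(alpha, a, b, beta, c):
    """General update C := alpha*A*B + beta*C on lists of rows, in place."""
    for i in range(len(c)):
        for j in range(len(c[0])):
            acc = sum(a[i][p] * b[p][j] for p in range(len(b)))
            c[i][j] = alpha * acc + beta * c[i][j]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def syrk_lower(alpha, a, beta, c, block_rows=BLOCK_ROWS):
    """Update the lower triangle of C := alpha*A*A^T + beta*C using gemm()."""
    n = len(c)
    for start in range(0, n, block_rows):
        stop = min(start + block_rows, n)
        if start > 0:
            # Off-diagonal block (rows start..stop-1, columns 0..start-1):
            # a full rectangle, so one gemm() call handles it.
            left = [row[:start] for row in c[start:stop]]
            gemm(alpha, a[start:stop], transpose(a[:start]), beta, left)
            for r, row in enumerate(left):
                c[start + r][:start] = row
        # Diagonal block: small, and only its lower triangle is touched.
        for i in range(start, stop):
            for j in range(start, i + 1):
                acc = sum(a[i][p] * a[j][p] for p in range(len(a[0])))
                c[i][j] = alpha * acc + beta * c[i][j]
    return c
```

Changing `block_rows` alters how much of the work goes through `gemm` but not the result, which is the property that lets a parameter of this kind be tuned per machine without touching the algorithm.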