On Synthesizing Optimal Family of Linear Systolic Arrays for Matrix Multiplication

  • Authors:
  • V. K. Prasanna Kumar;Yu-Chen Tsai

  • Affiliations:
  • -;-

  • Venue:
  • IEEE Transactions on Computers
  • Year:
  • 1991

Quantified Score

Hi-index 14.99

Abstract

The authors describe a family of linear systolic arrays for matrix multiplication exhibiting a tradeoff between local storage and the number of processing elements (PEs). The design consists of processors connected in a linear array, with each processor having storage s, 1 ≤ s ≤ n, for n × n matrix multiplication, where the number of processors equals n⌈n/s⌉, that is, n times the least integer greater than or equal to n/s. The input matrices are fed as two-speed data streams, using fast and slow channels, to satisfy the dependencies in the usual matrix multiplication algorithm. While a family of linear arrays has previously been synthesized for this problem, this technique leads to simpler designs with fewer processors and improved delay from input to output. All these designs use the optimal number of processors for local storage in the range 1 ≤ s ≤ n. The data flow is unidirectional, which makes the designs implementable on fault-tolerant wafer scale integration models.
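The storage/processor tradeoff stated in the abstract can be tabulated with a short Python sketch. It only evaluates the processor-count formula n⌈n/s⌉ and does not model the arrays themselves; the function name `pe_count` and the example size n = 8 are illustrative assumptions, not taken from the paper.

```python
def pe_count(n: int, s: int) -> int:
    """Number of PEs for n x n matrix multiplication with per-PE storage s.

    Evaluates n * ceil(n / s), the count quoted in the abstract.
    The name and interface are hypothetical, chosen for illustration.
    """
    if not 1 <= s <= n:
        raise ValueError("storage s must satisfy 1 <= s <= n")
    ceil_n_over_s = -(-n // s)  # integer ceiling of n / s
    return n * ceil_n_over_s


if __name__ == "__main__":
    n = 8  # assumed example size
    for s in range(1, n + 1):
        print(f"s = {s}: {pe_count(n, s)} PEs")
```

With s = 1 the formula gives n² processors, and with s = n it gives n processors, which are the two ends of the range covered by the family.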