A Unifying Lattice-Based Approach for the Partitioning of Systolic Arrays via LPGS and LSGP

  • Authors:
  • Karl-Heinz Zimmermann

  • Affiliations:
  • Department of Electrical and Computer Engineering, Technical University Hamburg-Harburg, 21071 Hamburg, Germany

  • Venue:
  • Journal of VLSI Signal Processing Systems
  • Year:
  • 1997

Abstract

Various methods for the synthesis of systolic arrays from signal and image processing algorithms have been developed in the past few years. In this paper, we propose a technique for the partitioning problem, that is, the problem of synthesizing systolic arrays whose size does not match the problem size. Our technique generalizes most of the known lattice-based approaches to the partitioning problem and combines the multiprojection method for the synthesis of systolic arrays with the locally sequential-globally parallel (LSGP) and locally parallel-globally sequential (LPGS) partitioning schemes. Starting from (1) a k-dimensional large-size systolic array obtained from a system of n-dimensional uniform recurrences by a space-time transformation and (2) an arbitrary lattice in k-space inducing a partitioning of the array into subarrays, a small-size systolic array with a scalar-valued system clock is constructed via the LSGP or LPGS paradigm. In particular, the allocation function for the small-size array can be written in closed form, and the timing function is obtained by simple greedy algorithms from timing functions for the subdomains, that is, the sets of operations performed by the subarrays. In this way, the problem of finding optimal timing functions can in various cases be reduced to finding optimal timing functions for the subdomains. For problems of large size, these greedy algorithms seem to be preferable to existing integer or non-convex programming formulations for finding (sub-)optimal timing functions. We also provide some new results: a necessary and sufficient condition for the existence of counter data flow, a formal relationship between partitionings of processor space and index space of the uniform recurrences in terms of counter data flow, and the structural equivalence between the lattice-based LSGP and LPGS schemes applied to the partitioning of index and processor space.
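
As an illustration only, the difference between the two partitioning schemes can be sketched for the simplest case, a rectangular lattice generated by a diagonal matrix. The abstract does not give the paper's closed-form allocation function for arbitrary lattices, so the function names (`split`, `lsgp_allocation`, `lpgs_allocation`), the array size, and the tile size below are assumptions made for this sketch. Timing functions are not modeled.

```python
from itertools import product

def split(point, tile):
    """Decompose a processor coordinate of the large array into
    (tile index, offset inside the tile) for the rectangular lattice
    generated by diag(tile). Assumption: the lattice is diagonal."""
    quotient = tuple(p // t for p, t in zip(point, tile))
    offset = tuple(p % t for p, t in zip(point, tile))
    return quotient, offset

def lsgp_allocation(point, tile):
    # LSGP: each subarray (tile) is executed sequentially by a single
    # processor, so the small array has one processor per tile.
    return split(point, tile)[0]

def lpgs_allocation(point, tile):
    # LPGS: the small array has the shape of one tile and processes
    # the tiles one after another, so a point maps to its offset.
    return split(point, tile)[1]

# Hypothetical 4 x 6 large array partitioned into 2 x 3 tiles.
large = (4, 6)
tile = (2, 3)
points = list(product(*(range(n) for n in large)))

lsgp_processors = {lsgp_allocation(p, tile) for p in points}
lpgs_processors = {lpgs_allocation(p, tile) for p in points}

print(len(lsgp_processors), "processors under LSGP")
print(len(lpgs_processors), "processors under LPGS")
```

In this sketch the LSGP array size equals the number of tiles, while the LPGS array size equals the number of points in one tile. A general lattice would replace the componentwise division by a change of basis with respect to the lattice generators.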