Efficient Pipelining of Nested Loops: Unroll-and-Squash

  • Authors:
  • Darin Petkov;Randolph E. Harr;Saman P. Amarasinghe

  • Affiliations:
  • -;-;-

  • Venue:
  • IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

The size and complexity of current custom VLSI have forced the use of high-level programming languages to describe hardware, and compiler and synthesis technology to map abstract designs into silicon. Since streaming data processing in DSP applications is typically described by loop constructs in a high-level language, loops are the most critical portions of the hardware description and special techniques are developed to optimally synthesize them. In this paper, we introduce a new method for mapping and pipelining nested loops efficiently into hardware. It achieves fine-grain parallelism even on strong intra- and inter-iteration data-dependent inner loops and, by sharing resources economically, improves performance at the expense of a small amount of additional area. We implemented the transformation within the Nimble Compiler environment and evaluated its performance on several signal-processing benchmarks. The method achieves up to 2x improvement in the area efficiency compared to the best known optimization techniques.