Loop Shifting and Compaction for the High-Level Synthesis of Designs with Complex Control Flow

Authors:
Sumit Gupta;Nikil Dutt;Rajesh Gupta;Alexandru Nicolau
Venue:
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Year:
2004

Abstract

Emerging embedded system applications in multimedia and image processing are characterized by complex control flow consisting of deeply nested conditionals and loops. We present a technique called loop shifting that incrementally exploits loop-level parallelism by shifting and compacting operations across loop iterations. Our experimental results show that loop shifting is particularly effective for the synthesis of designs with complex control, especially when resource utilization is already high, resource constraints are tight, or both. When further loop unrolling (or initiating another iteration of the loop body) would sharply increase both the longest combinational path in the circuit and the circuit area, loop shifting achieves up to a 20% reduction in the input-to-output delay of the synthesized circuit. We implemented loop shifting within the SPARK parallelizing high-level synthesis framework and present results for experiments on designs derived from multimedia and image processing applications.
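To make the idea of shifting an operation across loop iterations concrete, here is a minimal software sketch. It is an illustration, not the SPARK implementation. It assumes one common formulation of loop shifting: an operation at the top of the loop body is moved to the bottom of the body, and a copy is placed before the loop so that the first iteration still sees its result. The functions `original` and `shifted`, the three operations, and the parameter names are hypothetical.

```python
def original(xs, k):
    """Reference loop: each iteration runs op1 -> op2 -> op3 in sequence."""
    out = []
    for i in range(len(xs)):
        a = xs[i] * k        # op1: depends only on the input and the index
        b = a + 1            # op2: depends on op1
        out.append(b * b)    # op3: depends on op2
    return out


def shifted(xs, k):
    """Same computation with op1 shifted by one iteration (illustrative only)."""
    n = len(xs)
    out = []
    if n == 0:
        return out
    a = xs[0] * k            # copy of op1 placed before the loop, for iteration 0
    for i in range(n):
        b = a + 1            # op2 for iteration i
        out.append(b * b)    # op3 for iteration i
        if i + 1 < n:
            # op1 for iteration i + 1, now at the end of the body. It does not
            # depend on op2 or op3 of iteration i, so a scheduler would be free
            # to compact it into the same step as either of them.
            a = xs[i + 1] * k
    return out
```

In the shifted version the dependence chain inside the body is two operations long instead of three, without duplicating the whole body as unrolling would. The bounds check on `i + 1` exists only to keep the software sketch safe; how a synthesis tool handles the final iteration is outside what this sketch assumes.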