Software pipelining: an effective scheduling technique for VLIW machines
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Loop optimization in register-transfer scheduling for DSP-systems
DAC '89 Proceedings of the 26th ACM/IEEE Design Automation Conference
Global scheduling independent of control dependencies based on condition vectors
DAC '92 Proceedings of the 29th ACM/IEEE Design Automation Conference
DAC '90 Proceedings of the 27th ACM/IEEE Design Automation Conference
Rotation scheduling: a loop pipelining algorithm
DAC '93 Proceedings of the 30th international Design Automation Conference
Resource-Constrained Software Pipelining
IEEE Transactions on Parallel and Distributed Systems
DAC '98 Proceedings of the 35th annual Design Automation Conference
Conditional speculation and its effects on performance and area for high-level snthesis
Proceedings of the 14th international symposium on Systems synthesis
Automatic Extraction of Functional Parallelism from Ordinary Programs
IEEE Transactions on Parallel and Distributed Systems
Perfect Pipelining: A New Loop Parallelization Technique
ESOP '88 Proceedings of the 2nd European Symposium on Programming
Analysis of conditional resource sharing using a guard-based control representation
ICCD '95 Proceedings of the 1995 International Conference on Computer Design: VLSI in Computers and Processors
Combining MBP-speculative computation and loop pipelining in high-level synthesis
EDTC '95 Proceedings of the 1995 European conference on Design and Test
Algorithm and Hardware Support for Branch Anticipation
GLS '97 Proceedings of the 7th Great Lakes Symposium on VLSI
MICRO 14 Proceedings of the 14th annual workshop on Microprogramming
SPARK: A High-Lev l Synthesis Framework For Applying Parallelizing Compiler Transformations
VLSID '03 Proceedings of the 16th International Conference on VLSI Design
An Efficient Global Resource-Directed Approach to Exploiting Instruction-Level Parallelism
PACT '96 Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques
Journal of VLSI Signal Processing Systems
The impact of loop unrolling on controller delay in high level synthesis
Proceedings of the conference on Design, automation and test in Europe
Optimal Unroll Factor for Reconfigurable Architectures
ARC '08 Proceedings of the 4th international workshop on Reconfigurable Computing: Architectures, Tools and Applications
Optimal Loop Unrolling and Shifting for Reconfigurable Architectures
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Modern development methods and tools for embedded reconfigurable systems: A survey
Integration, the VLSI Journal
Studying the code compression design space - A synthesis approach
Journal of Systems Architecture: the EUROMICRO Journal
Hi-index | 0.00 |
Emerging embedded system applications in multimedia and image processing are characterized by complex control flow consisting of deeply nested conditionals and loops. Wepresent a technique called loop shifting that incrementally exploits loop level parallelism across iterations by shifting and compacting operations across loop iterations. Our experimental results show that loop shifting is particularly effective for the synthesis of designs with complex control especially when resource utilization is already high and/or under tight resource constraints. In situations when further loop unrolling (or initiating another iteration of the loop body) leads to a sharp increase in the longest combinational path in the circuit and the circuit area, loop shifting is able to achieve up to 20 % reduction in the input-to-output delay in the synthesized circuit. We implemented loop shifting within the SPARK parallelizing high-level synthesis framework and present results for experiments on designs derived from multimedia and image processing applications.