In this work we consider a special optimization problem that arises when compiling compound loops (combinations of nested and consecutive sub-loops) to Verilog. Each sub-loop of the compound loop may require a different optimized hardware configuration (OHC) to minimize its execution time. For example, one loop may require at least two memory ports and one multiplier for an optimized execution time, while another may require only one memory port but two multipliers, yet a single OHC must be selected for both loops. The goal is to compute a minimal OHC which, based on the different heat levels (expected numbers of iterations) of the sub-loops, is a good compromise between the conflicting requirements of the sub-loops. Although synthesis of nested loops has been implemented in quite a few systems, this aspect has not been considered so far. We avoid time-consuming integer linear programming (ILP) techniques and instead use a fast space exploration technique combined with an efficient variant of list scheduling. Another novel aspect of the proposed system is the observation that the real latencies of the hardware units should be treated as variables of the OHC rather than as the fixed values usually assumed in high-level synthesis systems. Experimental results show that this search procedure significantly improves the OHC without significantly increasing the execution time.
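The trade-off described above can be illustrated with a minimal Python sketch. It is not the system's actual algorithm: the function names (`list_schedule`, `weighted_time`, `pick_config`), the `slack` tolerance, the unit-count cost measure, and the exhaustive enumeration of configurations are all assumptions made for illustration. Every operation is also given a fixed one-cycle latency, so the sketch does not model latencies as variables of the OHC.

```python
from itertools import product


def list_schedule(ops, config):
    """Resource-constrained list scheduling with unit latencies (illustrative).

    ops:    dict mapping op name -> (resource kind, list of predecessor names);
            insertion order doubles as the scheduling priority.
    config: dict mapping resource kind -> units available per cycle.
    Returns the number of cycles needed for one loop iteration.
    """
    for kind, _ in ops.values():
        if config.get(kind, 0) < 1:
            raise ValueError(f"no unit of kind {kind!r} in configuration")
    finish = {}  # op name -> cycle in which the op executes
    cycle = 0
    while len(finish) < len(ops):
        cycle += 1
        free = dict(config)  # units still free in this cycle
        for name, (kind, preds) in ops.items():
            if name in finish:
                continue
            # An op is ready once all predecessors ran in an earlier cycle.
            ready = all(p in finish and finish[p] < cycle for p in preds)
            if ready and free[kind] > 0:
                free[kind] -= 1
                finish[name] = cycle
    return cycle


def weighted_time(loops, config):
    """Sum of per-iteration schedule lengths, weighted by each loop's heat."""
    return sum(heat * list_schedule(ops, config) for ops, heat in loops)


def pick_config(loops, max_units=3, slack=0.05):
    """Pick the cheapest configuration within `slack` of the best weighted time.

    Cost is simply the total number of units (a hypothetical cost model).
    """
    kinds = sorted({kind for ops, _ in loops for kind, _ in ops.values()})
    scored = []
    for counts in product(range(1, max_units + 1), repeat=len(kinds)):
        config = dict(zip(kinds, counts))
        scored.append((config, weighted_time(loops, config)))
    best = min(time for _, time in scored)
    acceptable = [(c, t) for c, t in scored if t <= best * (1 + slack)]
    return min(acceptable, key=lambda ct: (sum(ct[0].values()), ct[1]))


# Hot loop: two loads feeding one multiply (prefers two memory ports).
loop_a = {"ld1": ("mem", []), "ld2": ("mem", []), "mul": ("mul", ["ld1", "ld2"])}
# Cold loop: one load feeding two multiplies (prefers two multipliers).
loop_b = {"ld": ("mem", []), "mul1": ("mul", ["ld"]), "mul2": ("mul", ["ld"])}
loops = [(loop_a, 100), (loop_b, 10)]  # (operations, heat)

print(pick_config(loops))
```

With these example heats, the hot loop dominates the weighted time, so the sketch selects two memory ports and a single multiplier: adding a second multiplier would help only the cold loop and falls within the chosen tolerance.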