Cathedral-III: Architecture-driven high-level synthesis for high throughput DSP applications
DAC '91 Proceedings of the 28th ACM/IEEE Design Automation Conference
Fast and near optimal scheduling in automatic data path synthesis
DAC '91 Proceedings of the 28th ACM/IEEE Design Automation Conference
Lifetime-sensitive modulo scheduling
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Iterative modulo scheduling: an algorithm for software pipelining loops
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Minimum register requirements for a modulo schedule
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Minimizing register requirements under resource-constrained rate-optimal software pipelining
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Data-parallel C on a reconfigurable logic array
The Journal of Supercomputing - Special issue on field programmable gate arrays
Optimum modulo schedules for minimum register requirements
ICS '95 Proceedings of the 9th international conference on Supercomputing
Stage scheduling: a technique to reduce the register requirements of a modulo schedule
Proceedings of the 28th annual international symposium on Microarchitecture
Component selection for high-performance pipelines
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Efficient formulation for optimal modulo schedulers
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Optimal Modulo Scheduling Through Enumeration
International Journal of Parallel Programming
PICO-NPA: High-Level Synthesis of Nonprogrammable Hardware Accelerators
Journal of VLSI Signal Processing Systems
A system for synthesizing optimized FPGA hardware from MATLAB
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
The Garp Architecture and C Compiler
Computer
DEFACTO: A Design Environment for Adaptive Computing Technology
Proceedings of the 11 IPPS/SPDP'99 Workshops Held in Conjunction with the 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing
Reduced code size modulo scheduling in the absence of hardware support
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Global resource sharing for synthesis of control data flow graphs on FPGAs
Proceedings of the 40th annual Design Automation Conference
NAPA C: Compiling for a Hybrid RISC/FPGA Architecture
FCCM '98 Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines
Parallelizing Applications into Silicon
FCCM '99 Proceedings of the Seventh Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Facet: A procedure for the automated synthesis of digital systems
DAC '83 Proceedings of the 20th Design Automation Conference
Cameron: High Level Language Compilation for Reconfigurable Systems
PACT '99 Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques
Swing Modulo Scheduling: A Lifetime-Sensitive Approach
PACT '96 Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques
Bitwidth cognizant architecture synthesis of custom hardware accelerators
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Streamroller:: automatic synthesis of prescribed throughput accelerator pipelines
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Increasing hardware efficiency with multifunction loop accelerators
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Modulo scheduling for highly customized datapaths to increase hardware reusability
Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization
VEAL: Virtualized Execution Accelerator for Loops
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
DVFS in loop accelerators using BLADES
Proceedings of the 45th annual Design Automation Conference
Playing the trade-off game: Architecture exploration using Coffeee
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Finding the best compromise in compiling compound loops to Verilog
Journal of Systems Architecture: the EUROMICRO Journal
Integrated Code Generation for Loops
ACM Transactions on Embedded Computing Systems (TECS)
Loop acceleration exploration for ASIP architecture
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
SDC-based modulo scheduling for pipeline synthesis
Proceedings of the International Conference on Computer-Aided Design
Hi-index | 0.00 |
Scheduling algorithms used in compilers traditionally focus on goals such as reducing schedule length and register pressure or producing compact code. In the context of a hardware synthesis system where the schedule is used to determine various components of the hardware, including datapath, storage, and interconnect, the goals of a scheduler change drastically. In addition to achieving the traditional goals, the scheduler must proactively make decisions to ensure efficient hardware is produced. This paper proposes two exact solutions for cost sensitive modulo scheduling, one based on an integer linear programming formulation and another based on branch-and-bound search. To achieve reasonable compilation times, decomposition techniques to break down the complex scheduling problem into phase ordered sub-problems are proposed. The decomposition techniques work either by partitioning the dataflow graph into smaller subgraphs and optimally scheduling the subgraphs, or by splitting the scheduling problem into two phases, time slot and resource assignment. The effectiveness of cost sensitive modulo scheduling in minimizing the costs of function units, register structures, and interconnection wires are evaluated within a fully automatic synthesis system for loop accelerators. The cost sensitive modulo scheduler increases the efficiency of the resulting hardware significantly compared to both traditional cost unaware and greedy cost aware modulo schedulers.