Hand-written RTL still yields hardware accelerators of better quality than those produced by automated design flows. Manually developing and maintaining hardware designs, however, is complex, time-consuming, and error-prone, which makes improving the automatic design flow an active research topic. The most common approach relies on a statically computed schedule. Support for features such as dynamic scheduling or unbounded latency of operations and functional units has been proposed, but with some limitations. Instruction auto-scheduling is an alternative that overcomes these restrictions and handles situations that require, or benefit from, run-time adaptive reordering of instructions. This paper focuses on improving the synthesis of hardware cores by increasing the automatic exploitation of parallelism. The proposed approach computes, for each instruction, the set of conditions that must be satisfied for it to execute as soon as possible, which enables run-time auto-scheduling. Because these conditions are represented as logic functions, generating the corresponding hardware implementation can be easily automated. Experimental results show an encouraging performance improvement with a limited increase in area.
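The idea of per-instruction activation conditions can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes that only data dependences matter (the actual conditions may also cover control flow and resource constraints), that the dependence graph is acyclic, and that every latency is at least one cycle. The graph `DEPS`, the `done_x` signal names, and both functions are hypothetical names introduced for illustration only.

```python
from typing import Dict, Set

# Hypothetical data-dependence graph: instruction -> instructions it waits for.
DEPS: Dict[str, Set[str]] = {
    "a": set(),
    "b": set(),
    "c": {"a", "b"},
    "d": {"a"},
    "e": {"c", "d"},
}


def activation_condition(instr: str, deps: Dict[str, Set[str]]) -> str:
    """Render the start condition of `instr` as an AND of 'done' signals."""
    preds = sorted(deps[instr])
    return " & ".join(f"done_{p}" for p in preds) if preds else "1"


def simulate(deps: Dict[str, Set[str]], latency: Dict[str, int]) -> Dict[str, int]:
    """Cycle-by-cycle simulation: an instruction starts in the first cycle
    in which all of its predecessors have already completed."""
    start: Dict[str, int] = {}
    finish: Dict[str, int] = {}
    horizon = sum(latency.values())  # upper bound on any start time if acyclic
    cycle = 0
    while len(start) < len(deps):
        if cycle > horizon:
            raise ValueError("dependence graph appears to be cyclic")
        # 'done' signals visible at the beginning of this cycle.
        done = {i for i, t in finish.items() if t <= cycle}
        for instr in deps:
            if instr not in start and deps[instr] <= done:
                start[instr] = cycle
                finish[instr] = cycle + latency[instr]
        cycle += 1
    return start


if __name__ == "__main__":
    for name in DEPS:
        print(name, "starts when:", activation_condition(name, DEPS))
    # The same conditions adapt to different run-time latencies of "b".
    print(simulate(DEPS, {"a": 1, "b": 3, "c": 1, "d": 2, "e": 1}))
    print(simulate(DEPS, {"a": 1, "b": 1, "c": 1, "d": 2, "e": 1}))
```

In this sketch the conditions are fixed once, while the start cycles follow from whichever latencies occur at run time: shortening `b` lets `c` and `e` start earlier without recomputing anything, which is the behavior a static schedule cannot provide.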