Cathedral-III: Architecture-driven high-level synthesis for high throughput DSP applications
DAC '91 Proceedings of the 28th ACM/IEEE Design Automation Conference
The ALPHA language and its use for the design of systolic arrays
Journal of VLSI Signal Processing Systems - Special issue: algorithms and parallel VSLI architecture
Iterative modulo scheduling: an algorithm for software pipelining loops
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
SUIF: an infrastructure for research on parallelizing and optimizing compilers
ACM SIGPLAN Notices
The system architect's workbench
DAC '88 Proceedings of the 25th ACM/IEEE Design Automation Conference
HERCULES—a system for high-level synthesis
DAC '88 Proceedings of the 25th ACM/IEEE Design Automation Conference
Design process model in the Yorktown Silicon Compiler
DAC '88 Proceedings of the 25th ACM/IEEE Design Automation Conference
Allocation and scheduling of conditional task graph in hardware/software co-synthesis
Proceedings of the conference on Design, automation and test in Europe
PICO-NPA: High-Level Synthesis of Nonprogrammable Hardware Accelerators
Journal of VLSI Signal Processing Systems
The MIMOLA design system: Detailed description of the software system
DAC '79 Proceedings of the 16th Design Automation Conference
Synthesis of Application Specific Multiprocessor Architectures for Process Networks
VLSID '04 Proceedings of the 17th International Conference on VLSI Design
System Design Using Kahn Process Networks: The Compaan/Laura Approach
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Cost Sensitive Modulo Scheduling in a Loop Accelerator Synthesis System
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Increasing hardware efficiency with multifunction loop accelerators
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
MPSoC Design Using Application-Specific Architecturally Visible Communication
HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Reducing Memory Constraints in Modulo Scheduling Synthesis for FPGAs
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
The benefits of using variable-length pipelined operations in high-level synthesis
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
In this paper, we present a methodology for designing a pipeline of accelerators for an application. The application is modeled using sequential C language with simple stylizations. The synthesis of the accelerator pipeline involves designing loop accelerators for individual kernels, instantiating buffers for arrays used in the application, and hooking up these building blocks to form a pipeline. A compiler-based system automatically synthesizes loop accelerators for individual kernels at varying performance levels. An integer linear program formulation which simultaneously optimizes the cost of loop accelerators and the cost of memory buffers is proposed to compose the loop accelerators to form an accelerator pipeline for the whole application. Cases studies for some applications, including FMRadio and Beamformer, are presented to illustrate our design methodology. Experiments show significant cost savings are achieved through hardware sharing, while achieving the prescribed throughput requirements.