A compiler approach to fast hardware design space exploration in FPGA-based systems
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
Compiler-generated communication for pipelined FPGA applications
Proceedings of the 40th annual Design Automation Conference
Exploiting Program Branch Probabilities in Hardware Compilation
IEEE Transactions on Computers
Evaluating heuristics in automatically mapping multi-loop applications to FPGAs
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
Automatically partitioning packet processing applications for pipelined architectures
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Resource sharing in pipelined CDFG synthesis
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
C is for circuits: capturing FPGA circuits as sequential code for portability
Proceedings of the 16th international ACM/SIGDA symposium on Field programmable gate arrays
Modern development methods and tools for embedded reconfigurable systems: A survey
Integration, the VLSI Journal
Compiling for reconfigurable computing: A survey
ACM Computing Surveys (CSUR)
Code transformations for embedded reconfigurable computing architectures
GTTSE'09 Proceedings of the 3rd international summer school conference on Generative and transformational techniques in software engineering III
An FPGA-based multi-core approach for pipelining computing stages
Proceedings of the 28th Annual ACM Symposium on Applied Computing
Hi-index | 0.00 |
Reconfigurable systems, and in particular, FPGA-based custom computing machines, offer a unique opportunity to define application-specific architectures. These architectures offer performance advantages for application domains such as image processing, where the use of customized pipelines exploits the inherent coarse-grain parallelism. In this paper we describe a set of program analyses and an implementation that map a sequential and un-annotated C program into a pipelined implementation running on a set of FPGAs, each with multiple external memories. Based on well-known parallel computing analysis techniques, our algorithms perform unrolling for operator parallelization, reuse and data layout for memory parallelization and precise communication analysis. We extend these techniques for FPGA-based systems to automatically partition the application data and computation into custom pipeline stages, taking into account the available FPGA and interconnect resources. We illustrate the analysis components by way of an example, a machine vision program. We present the algorithmresults, derived with minimal manual intervention, which demonstrate the potential of this approach for automatically deriving pipelined designs from high-level sequential specifications.