This paper presents a set of measurements that characterize the design space for automatically mapping high-level algorithms, expressed in C as multiple loop nests, onto an FPGA. We extend a prior compiler algorithm that derived optimized FPGA implementations for individual loop nests. We focus on the space-time tradeoffs that arise when constrained chip area is shared among multiple computations organized as an asynchronous pipeline. Intermediate results are communicated on chip, and this communication is generated automatically by communication analysis. Other analyses and transformations, also drawn from parallelizing compiler technology, perform high-level optimization of the designs. We vary the amount of parallelism in individual loop nests with the goal of deriving an overall design that makes the most effective use of chip resources. We describe several heuristics for automatically searching the space and a set of metrics for evaluating and comparing designs. From results obtained through an automated process, we demonstrate that heuristics derived through sophisticated compiler analysis are the most effective at navigating this search space, particularly for more complex applications.
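To make the space-time tradeoff concrete, the following is a minimal sketch of one possible search heuristic, not the heuristics evaluated in the paper. It assumes a deliberately simple cost model that is purely illustrative: each pipeline stage (one loop nest) has a fixed amount of work, its latency shrinks linearly with its unroll factor, and its area grows linearly with that factor. The names `Stage`, `work`, `area_per_copy`, and `balance_pipeline` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One loop nest in the pipeline (hypothetical cost model)."""
    name: str
    work: int            # cycles needed at unroll factor 1 (assumed)
    area_per_copy: int   # area units per unrolled copy of the body (assumed)
    unroll: int = 1

    def latency(self) -> float:
        # Assumption: ideal speedup, latency scales as 1 / unroll.
        return self.work / self.unroll

    def area(self) -> int:
        # Assumption: area scales linearly with the unroll factor.
        return self.area_per_copy * self.unroll


def balance_pipeline(stages, area_budget):
    """Greedily double the unroll factor of the slowest stage while it fits.

    Pipeline throughput is limited by the slowest stage, so spending area
    on any other stage cannot help; the search stops as soon as the
    current bottleneck can no longer grow within the budget.
    """
    used = sum(s.area() for s in stages)
    if used > area_budget:
        raise ValueError("baseline design does not fit in the area budget")
    while True:
        bottleneck = max(stages, key=lambda s: s.latency())
        extra = bottleneck.area()  # doubling unroll doubles this stage's area
        if used + extra > area_budget:
            break
        bottleneck.unroll *= 2
        used += extra
    return stages, used


if __name__ == "__main__":
    pipeline = [Stage("A", work=1000, area_per_copy=10),
                Stage("B", work=250, area_per_copy=20)]
    result, used = balance_pipeline(pipeline, area_budget=70)
    for s in result:
        print(s.name, s.unroll, s.latency(), s.area())
    print("area used:", used)
```

Under these assumptions the heuristic spends area only where it shortens the critical stage, which is one simple way of sharing constrained chip area among multiple computations. A realistic search would replace the linear model with area and latency estimates from synthesis tools and would also account for memory bandwidth and on-chip communication.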