SUIF: an infrastructure for research on parallelizing and optimizing compilers
ACM SIGPLAN Notices
Image and Video Compression Standards: Algorithms and Architectures
Image and Video Compression Standards: Algorithms and Architectures
Custom Memory Management Methodology: Exploration of Memory Organisation for Embedded Multimedia System Design
Compiling Application-Specific Hardware
FPL '02 Proceedings of the Reconfigurable Computing Is Going Mainstream, 12th International Conference on Field-Programmable Logic and Applications
Stream-Oriented FPGA Computing in the Streams-C High Level Language
FCCM '00 Proceedings of the 2000 IEEE Symposium on Field-Programmable Custom Computing Machines
SPARK: A High-Lev l Synthesis Framework For Applying Parallelizing Compiler Transformations
VLSID '03 Proceedings of the 16th International Conference on VLSI Design
Convex Optimization
The Challenges of Hardware Synthesis from C-Like Languages
Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Optimum and heuristic synthesis of multiple word-length architectures
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Automatic On-chip Memory Minimization for Data Reuse
FCCM '07 Proceedings of the 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Hi-index | 0.00 |
This paper describes our approaches to raise the level of abstraction at which hardware suitable for accelerating computationally-intensive applications can be specified. Field-Programmable Gate Arrays (FPGAs) are becoming adopted as a computational platform by the high-performance computing community, but there are challenges to extract maximum performance from these devices. Unlike other approaches, our focus is on data memory organisation and input-output bandwidth considerations, which are the typical stumbling block of existing hardware compilation schemes. We describe our approaches, which are based on formal optimization techniques, and present some results showing the advantage of exposing the interaction between data memory system design and parallelism extraction to the compiler.