Automatic translation of FORTRAN programs to vector form
ACM Transactions on Programming Languages and Systems (TOPLAS)
Kernel scheduling in reconfigurable computing
DATE '99 Proceedings of the conference on Design, automation and test in Europe
Proceedings of the 36th annual ACM/IEEE Design Automation Conference
Hardware-software co-design of embedded reconfigurable architectures
Proceedings of the 37th Annual Design Automation Conference
CHIMAERA: a high-performance architecture with a tightly-coupled reconfigurable functional unit
Proceedings of the 27th annual international symposium on Computer architecture
IEEE Transactions on Computers
Design and Implementation of the MorphoSys Reconfigurable ComputingProcessor
Journal of VLSI Signal Processing Systems - Special issue on VLSI on custom computing technology
Math toolkit for real-time programming
Math toolkit for real-time programming
Evaluation of the streams-C C-to-FPGA compiler: an applications perspective
FPGA '01 Proceedings of the 2001 ACM/SIGDA ninth international symposium on Field programmable gate arrays
Loop Pipelining and Optimization for Run Time Reconfiguration
IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
Instruction-Level Parallelism for Reconfigurable Computing
FPL '98 Proceedings of the 8th International Workshop on Field-Programmable Logic and Applications, From FPGAs to Computing Paradigm
RaPiD - Reconfigurable Pipelined Datapath
FPL '96 Proceedings of the 6th International Workshop on Field-Programmable Logic, Smart Applications, New Paradigms and Compilers
Cameron: High Level Language Compilation for Reconfigurable Systems
PACT '99 Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques
Physical resource binding for a Coarse-Grain reconfigurable array using evolutionary algorithms
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Bio Molecular Engine: a bio-inspired environment for models of growing and evolvable computation
GECCO '05 Proceedings of the 7th annual workshop on Genetic and evolutionary computation
Memory centric thread synchronization on platform FPGAs
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Optimizing compiler for shared-memory multiple SIMD architecture
Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers, and tool support for embedded systems
The Journal of Supercomputing
Journal of Systems Architecture: the EUROMICRO Journal
A holistic approach for tightly coupled reconfigurable parallel processors
Microprocessors & Microsystems
Resource aware mapping on coarse grained reconfigurable arrays
Microprocessors & Microsystems
Modern development methods and tools for embedded reconfigurable systems: A survey
Integration, the VLSI Journal
Data pipeline optimization for shared memory multiple-SIMD architecture
LCPC'06 Proceedings of the 19th international conference on Languages and compilers for parallel computing
Performance improvements of microprocessor platforms with a coarse-grained reconfigurable data-path
ICS'06 Proceedings of the 10th WSEAS international conference on Systems
Journal of Real-Time Image Processing
Hi-index | 0.00 |
The rapid growth of device densities on silicon has made it feasible to deploy reconfigurable hardware as a highly parallel computing platform. However, one of the obstacles to the wider acceptance of this technology is its programmability. The application needs to be programmed in hardware description languages or an assembly equivalent, whereas most application programmers are used to the algorithmic programming paradigm. SA-C has been proposed as an expression-oriented language designed to implicitly express data parallel operations. The Morphosys project proposes an SoC architecture consisting of reconfigurable hardware that supports a data-parallel, SIMD computational model. This paper describes a compiler framework to analyze SA-C programs, perform optimizations, and automatically map the application onto the Morphosys architecture. The mapping process is static and it involves operation scheduling, processor allocation and binding, and register allocation in the context of the Morphosys architecture. The compiler also handles issues concerning data streaming and caching in order to minimize data transfer overhead. We have compiled some important image-processing kernels, and the generated schedules reflect an average speedup in execution times of up to 6× compared to the execution on 800 MHz Pentium III machines.