Data reorganization engines for the next generation of system-on-a-chip FPGAs
FPGA '02 Proceedings of the 2002 ACM/SIGDA tenth international symposium on Field-programmable gate arrays
Code Generation in the Polyhedral Model Is Easier Than You Think
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
A practical automatic polyhedral parallelizer and locality optimizer
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
Accelerating Compute-Intensive Applications with GPUs and FPGAs
SASP '08 Proceedings of the 2008 Symposium on Application Specific Processors
High-Performance Heterogeneous Computing with the Convey HC-1
Computing in Science and Engineering
Single-Chip Heterogeneous Computing: Does the Future Include Custom Logic, FPGAs, and GPGPUs?
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Leap scratchpads: automatic memory and cache management for reconfigurable logic
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
LegUp: high-level synthesis for FPGA-based processor/accelerator systems
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
CoRAM: an in-fabric memory architecture for FPGA-based computing
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
Combined loop transformation and hierarchy allocation for data reuse optimization
Proceedings of the International Conference on Computer-Aided Design
CONNECT: re-examining conventional wisdom for designing nocs in the context of FPGAs
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
A performance and energy comparison of FPGAs, GPUs, and multicores for sliding-window applications
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
Prototype and evaluation of the CoRAM memory architecture for FPGA-based computing
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
Optimizing SDRAM bandwidth for custom FPGA loop accelerators
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
VirtualRC: a virtual FPGA platform for applications and tools portability
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
FPGA-specific synthesis of loop-nests with pipelined computational cores
Microprocessors & Microsystems
Hi-index | 0.00 |
This paper presents initial work on developing a C compiler for the CoRAM FPGA computing abstraction. The presented effort focuses on compiling fixed-bound perfect loop nests that operate on large data sets in external DRAM. As required by the CoRAM abstraction, the compiler partitions source code into two separate implementation components: (1) hardware kernel pipelines to be mapped onto the reconfigurable logic fabric; and (2) control threads that express, in a C-like language, the sequencing and coordination of data transfers between the hardware kernels and external DRAM. The compiler performs optimizations to increase parallelism and use DRAM bandwidth efficiently. It can target different FPGA platforms that support the CoRAM abstraction, either natively in a future FPGA or in soft-logic on today's devices. The CoRAM abstraction provides a convenient high-level compilation target to simplify the task of design optimization and system generation. The compiler is evaluated using three test programs (matrix-matrix multiplication, k-nearest neighbor, and 2D convolution) on the Xilinx ML605 and the Altera DE4. Results show that our compiler is able to target the different platforms and effectively exploit their dissimilar capacities and features. Depending on the application, the compiler-generated implementations achieve performance ranging from a factor of 4 slower to a factor of 2 faster relative to hand-designed implementations, as measured on actual hardware.