Evaluation of the streams-C C-to-FPGA compiler: an applications perspective
FPGA '01 Proceedings of the 2001 ACM/SIGDA ninth international symposium on Field programmable gate arrays
Profiling tools for hardware/software partitioning of embedded applications
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Stream-Oriented FPGA Computing in the Streams-C High Level Language
FCCM '00 Proceedings of the 2000 IEEE Symposium on Field-Programmable Custom Computing Machines
Fast Area Estimation to Support Compiler Optimizations in FPGA-Based Reconfigurable Systems
FCCM '02 Proceedings of the 10th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
A quantitative analysis of the speedup factors of FPGAs over processors
FPGA '04 Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays
An FPGA implementation of the two-dimensional finite-difference time-domain (FDTD) algorithm
FPGA '04 Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays
A compiled accelerator for biological cell signaling simulations
FPGA '04 Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays
Input data reuse in compiling window operations onto reconfigurable hardware
Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Proceedings of the 41st annual Design Automation Conference
Partitioning Methodology for Heterogeneous Reconfigurable Functional Units
The Journal of Supercomputing
Low-power warp processor for power efficient high-performance embedded systems
Proceedings of the conference on Design, automation and test in Europe
CHiMPS: a high-level compilation flow for hybrid CPU-FPGA architectures
Proceedings of the 16th international ACM/SIGDA symposium on Field programmable gate arrays
Efficient hardware code generation for FPGAs
ACM Transactions on Architecture and Code Optimization (TACO)
Optimal Unroll Factor for Reconfigurable Architectures
ARC '08 Proceedings of the 4th international workshop on Reconfigurable Computing: Architectures, Tools and Applications
Accelerating Speculative Execution in High-Level Synthesis with Cancel Tokens
ARC '08 Proceedings of the 4th international workshop on Reconfigurable Computing: Architectures, Tools and Applications
Liquid Metal: Object-Oriented Programming Across the Hardware/Software Boundary
ECOOP '08 Proceedings of the 22nd European conference on Object-Oriented Programming
Non-intrusive dynamic application profiler for detailed loop execution characterization
CASES '08 Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems
Optimus: efficient realization of streaming applications on FPGAs
CASES '08 Proceedings of the 2008 international conference on Compilers, architectures and synthesis for embedded systems
CODES+ISSS '08 Proceedings of the 6th IEEE/ACM/IFIP international conference on Hardware/Software codesign and system synthesis
Outer loop pipelining for application specific datapaths in FPGAs
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Scalability and parallel execution of warp processing: dynamic hardware/software partitioning
International Journal of Parallel Programming
A compiler intermediate representation for reconfigurable fabrics
International Journal of Parallel Programming
Performance and power of cache-based reconfigurable computing
Proceedings of the 36th annual international symposium on Computer architecture
Optimal Loop Unrolling and Shifting for Reconfigurable Architectures
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Modern development methods and tools for embedded reconfigurable systems: A survey
Integration, the VLSI Journal
Optimized generation of memory structure in compiling window operations onto reconfigurable hardware
ARC'07 Proceedings of the 3rd international conference on Reconfigurable computing: architectures, tools and applications
Automated synthesis of streaming C applications to process networks in hardware
Proceedings of the Conference on Design, Automation and Test in Europe
Impact of high-level transformations within the ROCCC framework
ACM Transactions on Architecture and Code Optimization (TACO)
Efficient hardware-based nonintrusive dynamic application profiling
ACM Transactions on Embedded Computing Systems (TECS)
Constructing application-specific memory hierarchies on FPGAs
Transactions on high-performance embedded architectures and compilers III
Microprocessors & Microsystems
Scalable communication architectures for massively parallel hardware multi-processors
Journal of Parallel and Distributed Computing
Using memory profile analysis for automatic synthesis of pointers code
ACM Transactions on Embedded Computing Systems (TECS)
ACM Transactions on Embedded Computing Systems (TECS)
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Accelerating radiation dose calculation: A multi-FPGA solution
ACM Transactions on Embedded Computing Systems (TECS) - Special Section on ESTIMedia'10
SWSL: software synthesis for network lookup
ANCS '13 Proceedings of the ninth ACM/IEEE symposium on Architectures for networking and communications systems
Design of massively parallel hardware multi-processors for highly-demanding embedded applications
Microprocessors & Microsystems
Hi-index | 0.00 |
FPGAs, as computing devices, offer significant speedup over microprocessors. Furthermore, their configurability offers an advantage over traditional ASICs. However, they do not yet enjoy high-level language programmability, as microprocessors do. This has become the main obstacle for their wider acceptance by application designers. ROCCC is a compiler designed to generate circuits from C source code to execute on FPGAs, more specifically on CSoCs. It generates RTL level HDLs from frequently executing kernels in an application. In this paper, we describe ROCCC's system overview and focus on its data path generation. We compare the performance of ROCCC-generated VHDL code with that of Xilinx IPs. The synthesis result shows that ROCCC-generated circuit takes around 2x ~ 3x area and runs at comparable clock rate.