Optimized Generation of Data-Path from C Codes for FPGAs

Authors:
Zhi Guo; Betul Buyukkurt; Walid Najjar; Kees Vissers
Affiliations:
University of California Riverside; University of California Riverside; University of California Riverside; Xilinx Corp.
Venue:
Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Year:
2005

Abstract

FPGAs, as computing devices, offer significant speedup over microprocessors. Furthermore, their configurability offers an advantage over traditional ASICs. However, they do not yet enjoy high-level language programmability, as microprocessors do. This has become the main obstacle to their wider acceptance by application designers. ROCCC is a compiler designed to generate circuits from C source code to execute on FPGAs, more specifically on CSoCs. It generates RTL-level HDL from the frequently executed kernels of an application. In this paper, we give an overview of the ROCCC system and focus on its data-path generation. We compare the performance of ROCCC-generated VHDL code with that of Xilinx IPs. The synthesis results show that ROCCC-generated circuits take roughly 2x to 3x the area of the Xilinx IPs and run at a comparable clock rate.
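
To make "frequently executing kernel" concrete, the sketch below shows the kind of small, regular C loop that a C-to-HDL flow typically targets: a 5-tap FIR filter. It is an illustrative assumption, not code from the paper. The function name `fir5`, the tap count, and the coefficient values are hypothetical, and nothing here claims to reflect the exact C subset that ROCCC accepts.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_TAPS 5

/* Hypothetical filter coefficients, chosen only for illustration. */
static const int32_t coeff[NUM_TAPS] = {3, 5, 7, 9, 11};

/*
 * 5-tap FIR kernel: out[i] = sum over k of coeff[k] * in[i + k].
 * The caller must supply n_out + NUM_TAPS - 1 input samples.
 * The inner loop has a constant trip count and no loop-carried
 * dependence other than the accumulator, so it could in principle be
 * fully unrolled into a multiply-add data path.
 */
void fir5(const int32_t *in, int32_t *out, size_t n_out)
{
    for (size_t i = 0; i < n_out; ++i) {
        int32_t acc = 0;
        for (size_t k = 0; k < NUM_TAPS; ++k) {
            acc += coeff[k] * in[i + k];
        }
        out[i] = acc;
    }
}
```

In a loop like this, each output depends on a sliding window of five consecutive inputs, which is why such kernels are a natural fit for a pipelined hardware data path fed by a streaming input.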