Constructing the Procedure Call Multigraph
IEEE Transactions on Software Engineering
A formal model and specification language for procedure calling conventions
POPL '95 Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Specifying representations of machine instructions
ACM Transactions on Programming Languages and Systems (TOPLAS)
Advanced compiler design and implementation
Advanced compiler design and implementation
CHIMAERA: a high-performance architecture with a tightly-coupled reconfigurable functional unit
Proceedings of the 27th annual international symposium on Computer architecture
Dynamo: a transparent dynamic optimization system
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Synthesis and Optimization of Digital Circuits
Synthesis and Optimization of Digital Circuits
PACT HDL: a C compiler targeting ASICs and FPGAs with power and performance optimizations
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Binary Translation: Static, Dynamic, Retargetable?
ICSM '96 Proceedings of the 1996 International Conference on Software Maintenance
Hardware/software partitioning of software binaries
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Dynamic hardware/software partitioning: a first approach
Proceedings of the 40th annual Design Automation Conference
FCCM '03 Proceedings of the 11th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Evaluation of scheduling and allocation algorithms while mapping assembly code onto FPGAs
Proceedings of the 14th ACM Great Lakes symposium on VLSI
Automatic extraction of function bodies from software binaries
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
The New Jersey machine-code toolkit
TCON'95 Proceedings of the USENIX 1995 Technical Conference Proceedings
SOMA: a tool for synthesizing and optimizing memory accesses in ASICs
CODES+ISSS '05 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Automatic extraction of function bodies from software binaries
Proceedings of the 2005 Asia and South Pacific Design Automation Conference
New decompilation techniques for binary-level co-processor generation
ICCAD '05 Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design
Proceedings of the 41st annual Design Automation Conference
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Automatic hardware implementation tool for a discrete Adaboost-based decision algorithm
EURASIP Journal on Applied Signal Processing
Thread warping: a framework for dynamic synthesis of thread accelerators
CODES+ISSS '07 Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis
Proceedings of the 2007 Summer Computer Simulation Conference
An overview of a compiler for mapping software binaries to hardware
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Thread Warping: Dynamic and Transparent Synthesis of Thread Accelerators
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Generation of control and data flow graphs from scheduled and pipelined assembly code
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Hi-index | 0.00 |
The introduction of advanced FPGA architectures, with built-in DSP support, has given DSP designers a new hardware alternative. By exploiting its inherent parallelism, it is expected that FPGAs can outperform DSP processors. This paper describes the process and considerations for automatically translating binaries targeted for general DSP processors into Register Transfer Level (RTL) VHDL or Verilog code to be mapped onto commercial FPGAs. The Texas Instruments C6000 DSP processor architecture is chosen as the DSP processor platform, and the Xilinx Virtex II as a target FPGA. Various optimizations are discussed, including data dependency analysis, procedure extraction, induction variable analysis, memory optimizations, and scheduling. Experimental results on resource usage and performance are shown for ten software binary benchmarks. Results show performance gains of 3-20X in the FPGA designs over that of the DSP processors in terms of reductions of execution cycles.