While soft processor cores provided by FPGA vendors offer designers increased flexibility, such processors typically incur penalties in performance and energy consumption compared to hard processor core alternatives. The recently developed technology of warp processing can help reduce those penalties. Warp processing is the dynamic and transparent transformation of critical software regions from microprocessor execution to much faster circuit execution on an FPGA. In this article, we describe an implementation of a warp processor on Xilinx Virtex-II Pro and Spartan3 FPGAs incorporating one or more MicroBlaze soft processor cores. We further provide a detailed analysis of the energy overhead of dynamically partitioning an application's kernels to hardware executing within an FPGA. Considering an implementation that repartitions the executing application once every minute, a MicroBlaze-based warp processor implemented on a Spartan3 FPGA achieves an average speedup of 5.8× and an average energy reduction of 49% compared to the MicroBlaze soft processor core alone, providing performance and energy consumption competitive with existing hard processor cores.