We describe the results of a case study intended to determine whether new techniques for hardware/software partitioning of an application's binary are competitive with partitioning at the C source code level. While such competitiveness has previously been shown for standard benchmark suites involving smaller or unoptimized applications, this case study instead focuses on a complete, 16,000-line, highly optimized, commercial-grade application, namely an H.264 video decoder. The several-month study revealed that binary partitioning was indeed competitive, achieving a 2.5x speedup over a standard microprocessor, nearly identical to that achieved by source-level partitioning. Furthermore, the study revealed that several simple C-level coding modifications, including pass by value-return, function specialization, algorithmic specialization, hardware-targeted reimplementation, global array elimination, hoisting and sinking of error code, and conversion to explicit control flow, could lead to improved application speedups approaching 7x for both source-level and binary-level partitioning.
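To make two of the listed modifications concrete, the following is a minimal sketch of what pass by value-return combined with function specialization might look like in C. It is not code from the study: the function names, the `BLOCK` size, and the residual-add kernel are hypothetical, and the reading of "pass by value-return" as "copy the data into locals, compute, then copy the results back" is an assumption about the technique.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK 16 /* hypothetical block size, chosen only for illustration */

/* Pointer-based style: the loop reads and writes through pointers, so a
   partitioning or synthesis tool must assume dst and src may alias. */
void add_residual_ptr(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}

/* Assumed pass by value-return form, specialized to a fixed size:
   copy in, compute on local arrays only, then copy the result out.
   The compute loop touches no pointers and has a constant trip count. */
void add_residual_vr(int *dst, const int *src)
{
    int d[BLOCK], s[BLOCK];

    assert(dst != NULL && src != NULL);
    for (int i = 0; i < BLOCK; i++) { /* copy in */
        d[i] = dst[i];
        s[i] = src[i];
    }
    for (int i = 0; i < BLOCK; i++) /* compute on locals */
        d[i] += s[i];
    for (int i = 0; i < BLOCK; i++) /* copy out */
        dst[i] = d[i];
}
```

The two versions produce the same result when `dst` and `src` do not partially overlap. The second gives a tool local arrays and a fixed loop bound to work with, which is the kind of property such rewrites are presumably meant to expose.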