Combinatorial algorithms for integrated circuit layout
Combinatorial algorithms for integrated circuit layout
Hardware/software partitioning for multi-function systems
ICCAD '97 Proceedings of the 1997 IEEE/ACM international conference on Computer-aided design
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
A low power hardware/software partitioning approach for core-based embedded systems
Proceedings of the 36th annual ACM/IEEE Design Automation Conference
Reconfigurable computing: a survey of systems and software
ACM Computing Surveys (CSUR)
Hardware-Software Cosynthesis for Digital Systems
IEEE Design & Test
Profiling tools for hardware/software partitioning of embedded applications
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Garp: a MIPS processor with a reconfigurable coprocessor
FCCM '97 Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines
The NAPA Adaptive Processing Architecture
FCCM '98 Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines
A Quantitative Analysis of Reconfigurable Coprocessors for Multimedia Applications
FCCM '98 Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines
PCI-PipeRench and the SWORDAPI: A System for Stream-Based Reconfigurable Computing
FCCM '99 Proceedings of the Seventh Annual IEEE Symposium on Field-Programmable Custom Computing Machines
A quantitative analysis of the speedup factors of FPGAs over processors
FPGA '04 Proceedings of the 2004 ACM/SIGDA 12th international symposium on Field programmable gate arrays
ACM Transactions on Embedded Computing Systems (TECS)
A Partitioning Methodology for Accelerating Applications in Hybrid Reconfigurable Platforms
Proceedings of the conference on Design, Automation and Test in Europe - Volume 3
Application-specific customization of soft processor microarchitecture
Proceedings of the 2006 ACM/SIGDA 14th international symposium on Field programmable gate arrays
GALDS: a complete framework for designing multiclock ASICs and socs
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |
The integration of microprocessors and field-programmable gate array (FPGA) fabric on a single chip increases both the utility and necessity of tools that automatically move software functions from the microprocessor to accelerators on the FPGA to improve performance or energy. Such hardware/software partitioning for modern FPGAs involves the problem of partitioning functions among two levels of accelerator groups -- tightly-coupled accelerators that have fast single-clock-cycle memory access to the microprocessor's memory, and loosely-coupled accelerators that access memory through a bridge to avoid slowing the main clock period with their longer critical paths. We introduce this new two-level accelerator-partitioning problem, and we describe a novel optimal dynamic programming algorithm to solve the problem. By making use of the size constraint imposed by FPGAs, the algorithm has what is effectively quadratic runtime complexity, running in just a few seconds for examples with up to 25 accelerators, obtaining an average performance improvement of 35% compared to a traditional single-level bus architecture.