A hardware/software partitioner using a dynamically determined granularity
DAC '97 Proceedings of the 34th annual Design Automation Conference
The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
Energy-conscious HW/SW-partitioning of embedded systems: a case study on an MPEG-2 encoder
Proceedings of the 6th international workshop on Hardware/software codesign
A low power hardware/software partitioning approach for core-based embedded systems
Proceedings of the 36th annual ACM/IEEE Design Automation Conference
The design of an SRAM-based field-programmable gate array—part I: architecture
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Speed and area tradeoffs in cluster-based FPGA architectures
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Design and Implementation of the MorphoSys Reconfigurable ComputingProcessor
Journal of VLSI Signal Processing Systems - Special issue on VLSI on custom computing technology
Dynamo: a transparent dynamic optimization system
PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems
Hardware-Software Cosynthesis for Microcontrollers
IEEE Design & Test
Dynamic Reconfiguration in Mobile Systems
FPL '02 Proceedings of the Reconfigurable Computing Is Going Mainstream, 12th International Conference on Field-Programmable Logic and Applications
Hardware/software partitioning of software binaries
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Dynamic hardware/software partitioning: a first approach
Proceedings of the 40th annual Design Automation Conference
Proceedings of the 40th annual Design Automation Conference
Partitioning and Exploration Strategies in the TOSCA Co-Design Flow
CODES '96 Proceedings of the 4th International Workshop on Hardware/Software Co-Design
The Chimaera reconfigurable functional unit
FCCM '97 Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines
Garp: a MIPS processor with a reconfigurable coprocessor
FCCM '97 Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines
A dynamic instruction set computer
FCCM '95 Proceedings of the IEEE Symposium on FPGA's for Custom Computing Machines
A codesigned on-chip logic minimizer
Proceedings of the 1st IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Dynamic FPGA routing for just-in-time FPGA compilation
Proceedings of the 41st annual Design Automation Conference
Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Distributed HW/SW-Partitioning for Embedded Reconfigurable Networks
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
Proceedings of the 42nd annual Design Automation Conference
Exploiting Java through binary translation for low power embedded reconfigurable systems
SBCCI '05 Proceedings of the 18th annual symposium on Integrated circuits and system design
Profiling soft-core processor applications for hardware/software partitioning
Journal of Systems Architecture: the EUROMICRO Journal
Proceedings of the 41st annual Design Automation Conference
Dynamic task binding for hardware/software reconfigurable networks
SBCCI '06 Proceedings of the 19th annual symposium on Integrated circuits and systems design
Modeling and design of fault-tolerant and self-adaptive reconfigurable networked embedded systems
EURASIP Journal on Embedded Systems
An overview of reconfigurable hardware in embedded systems
EURASIP Journal on Embedded Systems
Dynamic and On-Line Design Space Exploration for Reconfigurable Architectures
Transactions on High-Performance Embedded Architectures and Compilers I
Design and implementation of a MicroBlaze-based warp processor
ACM Transactions on Embedded Computing Systems (TECS)
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
Slotless module-based reconfiguration of embedded FPGAs
ACM Transactions on Embedded Computing Systems (TECS)
A systematic approach to profiling for hardware/software partitioning
Computers and Electrical Engineering
Concept-based partitioning for large multidomain multifunctional embedded systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Hardware JIT compilation for off-the-shelf dynamically reconfigurable FPGAs
CC'08/ETAPS'08 Proceedings of the Joint European Conferences on Theory and Practice of Software 17th international conference on Compiler construction
An operating system infrastructure for fault-tolerant reconfigurable networks
ARCS'06 Proceedings of the 19th international conference on Architecture of Computing Systems
Web-Based cooperative design for soc and improved architecture exploration algorithm
APWeb'06 Proceedings of the 2006 international conference on Advanced Web and Network Technologies, and Applications
Parallel partitioning for distributed systems using sequential assignment
Journal of Parallel and Distributed Computing
Energy-aware fault-tolerant network-on-chips for addressing multiple traffic classes
Microprocessors & Microsystems
Hi-index | 0.00 |
In previous work, we showed the benefits and feasibility of having a processor dynamically partition its executing software such that critical software kernels are transparently partitioned to execute as a hardware coprocessor on configurable logic -- an approach we call warp processing. The configurable logic place and route step is the most computationally intensive part of such hardware/software partitioning, normally running for many minutes or hours on powerful desktop processors. In contrast, dynamic partitioning requires place and route to execute in just seconds and on a lean embedded processor. We have thereforedesigned a configurable logic architecture specifically for dynamic hardware/software partitioning. Through experiments with popular benchmarks, we show that by specifically focusing on the goal of software kernel speedup when designing the FPGA architecture, rather than on the more general goal of ASIC prototyping, we can perform place and route for our architecture 50 times faster, using 10,000 times less data memory, and 1,000 times less code memory, than popular commercial tools mapping to commercial configurable logic. Yet, we show that we obtain speedups (2x on average, and as much as 4x) and energy savings(33% on average, and up to 74%) when partitioning even just one loop, which are comparable to commercial tools and fabrics. Thus, our configurable logic architecture represents a goodcandidate for platforms that will support dynamic hardware/software partitioning, and enables ultra-fast desktop tools for hardware/software partitioning, and even for fast configurable logic design in general.