Reconfigurable systems have been shown to achieve significant speedups through architectures that map the most time-consuming application kernel modules or inner loops to a reconfigurable datapath. As each portion of the application starts to execute, the system partially reconfigures the datapath to perform the corresponding computation. To reduce its cost, area, and reconfiguration overhead, the reconfigurable datapath should contain as few hardware blocks and interconnections as possible, and these should be as simple as possible. To achieve that, hardware blocks and interconnections should be reused as much as possible across the application. We represent each piece of the application as a data-flow graph (DFG). The DFG merging process identifies similarities among the DFGs and produces a single datapath that can be dynamically reconfigured and has minimum area cost when both hardware blocks and interconnections are considered. In this paper we present a novel technique for the DFG merge problem, and we evaluate it using programs from the MediaBench benchmark. Our algorithm's execution time approaches that of the fastest previous solution to this problem, and it produces datapaths with an average area reduction of 20%. When compared to the best known area solution, our approach produces datapaths with area costs equivalent to, and in many cases better than, those of that solution, while achieving impressive speedups.
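
To make the idea of DFG merging concrete, the following is a minimal sketch and not the technique presented in the paper. It assumes a simple greedy heuristic: each node of the second DFG is mapped onto an unused hardware block of the same operation type, preferring the block that lets the most edges reuse existing interconnections. The function name `merge_dfgs`, the `(nodes, edges)` representation, and the block labels are all hypothetical, and multiplexer cost is ignored, so the result only counts blocks and interconnections.

```python
def merge_dfgs(dfg_a, dfg_b):
    """Greedily merge two DFGs into one datapath (illustrative only).

    Each DFG is a pair (nodes, edges): nodes maps a node id to its
    operation type, and edges is a set of (source, destination) ids.
    """
    nodes_a, edges_a = dfg_a
    nodes_b, edges_b = dfg_b

    # Start from DFG A: every node becomes a block, every edge a wire.
    blocks = {("a", n): op for n, op in nodes_a.items()}
    wires = {(("a", s), ("a", d)) for s, d in edges_a}

    mapping = {}   # node of B -> block in the merged datapath
    used = set()   # blocks already claimed by some node of B
    for n, op in nodes_b.items():
        best, best_score = None, -1
        for blk, blk_op in blocks.items():
            if blk_op != op or blk in used:
                continue
            # Count edges of B touching n that would land on an
            # existing wire if n were mapped to this block.
            score = 0
            for s, d in edges_b:
                if d == n and s in mapping and (mapping[s], blk) in wires:
                    score += 1
                if s == n and d in mapping and (blk, mapping[d]) in wires:
                    score += 1
            if score > best_score:
                best, best_score = blk, score
        if best is None:
            # No compatible free block: allocate a new one for B.
            best = ("b", n)
            blocks[best] = op
        mapping[n] = best
        used.add(best)

    # Add the wires DFG B needs; shared wires collapse in the set.
    wires |= {(mapping[s], mapping[d]) for s, d in edges_b}
    return blocks, wires, mapping


# DFG A computes mul -> add; DFG B computes mul -> add -> sub.
dfg_a = ({1: "mul", 2: "add"}, {(1, 2)})
dfg_b = ({1: "mul", 2: "add", 3: "sub"}, {(1, 2), (2, 3)})
blocks, wires, mapping = merge_dfgs(dfg_a, dfg_b)
print(len(blocks), "blocks,", len(wires), "interconnections")
```

In this toy case the two DFGs implemented separately would need five blocks and three interconnections, whereas the merged datapath needs three blocks and two interconnections. A greedy pass like this gives no optimality guarantee; it is only meant to show what reusing blocks and interconnections across DFGs looks like.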