Cathedral-III: Architecture-driven high-level synthesis for high throughput DSP applications
DAC '91 Proceedings of the 28th ACM/IEEE Design Automation Conference
Scheduling using behavioral templates
DAC '95 Proceedings of the 32nd annual ACM/IEEE Design Automation Conference
Exploiting regularity for low-power design
Proceedings of the 1996 IEEE/ACM international conference on Computer-aided design
Instruction set definition and instruction selection for ASIPs
ISSS '94 Proceedings of the 7th international symposium on High-level synthesis
Fast module mapping and placement for datapaths in FPGAs
FPGA '98 Proceedings of the 1998 ACM/SIGDA sixth international symposium on Field programmable gate arrays
The Generation of Optimal Code for Arithmetic Expressions
Journal of the ACM (JACM)
Optimal Code Generation for Expression Trees
Journal of the ACM (JACM)
Cross-level hierarchical high-level synthesis
Proceedings of the conference on Design, automation and test in Europe
Instruction generation for hybrid reconfigurable systems
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
A super-scheduler for embedded reconfigurable systems
Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design
CARDIS '98 Proceedings of the The International Conference on Smart Card Research and Applications
Totem: Custom Reconfigurable Array Generation
FCCM '01 Proceedings of the the 9th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Performance optimization using template mapping for datapath-intensive high-level synthesis
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Processor Acceleration Through Automated Instruction Set Customization
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Implementation of a UMTS Turbo-Decoder on a Dynamically Reconfigurable Platform
Proceedings of the conference on Design, automation and test in Europe - Volume 2
Area-efficient instruction set synthesis for reconfigurable system-on-chip designs
Proceedings of the 41st annual Design Automation Conference
International Journal of Parallel Programming - Special issue: Workshop on application specific processors (WASP)
INSIDE: INstruction Selection/Identification & Design Exploration for Extensible Processors
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
Dual-pipeline heterogeneous ASIP design
Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Scalable custom instructions identification for instruction-set extensible processors
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
Efficient metrics and high-level synthesis for dynamically reconfigurable logic
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Dataflow Mini-Graphs: Amplifying Superscalar Capacity and Bandwidth
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
An Architecture Framework for Transparent Instruction Set Customization in Embedded Processors
Proceedings of the 32nd annual international symposium on Computer Architecture
Automated Custom Instruction Generation for Domain-Specific Processor Acceleration
IEEE Transactions on Computers
Exploiting pipelining to relax register-file port constraints of instruction-set extensions
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
Exploring the design space of LUT-based transparent accelerators
Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems
A quantitative study and estimation models for extensible instructions in embedded processors
Proceedings of the 2004 IEEE/ACM International conference on Computer-aided design
Automating processor customisation: optimised memory access and resource sharing
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Customization of application specific heterogeneous multi-pipeline processors
Proceedings of the conference on Design, automation and test in Europe: Proceedings
Automatic selection of application-specific instruction-set extensions
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Application specific forwarding network and instruction encoding for multi-pipe ASIPs
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Scalable subgraph mapping for acyclic computation accelerators
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Automated framework for partitioning DSP applications in hybrid reconfigurable platforms
Microprocessors & Microsystems
Graph-Based Procedural Abstraction
Proceedings of the International Symposium on Code Generation and Optimization
Optimizing instruction-set extensible processors under data bandwidth constraints
Proceedings of the conference on Design, automation and test in Europe
An overview of reconfigurable hardware in embedded systems
EURASIP Journal on Embedded Systems
Rethinking custom ISE identification: a new processor-agnostic method
CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
Pattern-based behavior synthesis for FPGA resource reduction
Proceedings of the 16th international ACM/SIGDA symposium on Field programmable gate arrays
Fast, quasi-optimal, and pipelined instruction-set extensions
Proceedings of the 2008 Asia and South Pacific Design Automation Conference
The Instruction-Set Extension Problem: A Survey
ARC '08 Proceedings of the 4th international workshop on Reconfigurable Computing: Architectures, Tools and Applications
Proceedings of the 6th ACM conference on Computing frontiers
An Application Development Framework for ARISE Reconfigurable Processors
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Partitioning and scheduling of task graphs on partially dynamically reconfigurable FPGAs
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Modern development methods and tools for embedded reconfigurable systems: A survey
Integration, the VLSI Journal
BURS-based instruction set selection
PSI'06 Proceedings of the 6th international Andrei Ershov memorial conference on Perspectives of systems informatics
ARC'07 Proceedings of the 3rd international conference on Reconfigurable computing: architectures, tools and applications
SAMOS'07 Proceedings of the 7th international conference on Embedded computer systems: architectures, modeling, and simulation
Fast, nearly optimal ISE identification with I/O serialization through maximal clique enumeration
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
A generalized control-flow-aware pattern recognition algorithm for behavioral synthesis
Proceedings of the Conference on Design, Automation and Test in Europe
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
The Instruction-Set Extension Problem: A Survey
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Scientific Application Demands on a Reconfigurable Functional Unit Interface
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
The ARISE approach for extending embedded processors with arbitrary hardware accelerators
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
The Journal of Supercomputing
Hi-index | 0.00 |
The increasing demand for complex and specialized embedded hardware must be met by processors which are optimized for performance, yet are also extremely flexible. In our work, we explore the tradeoff between flexibility and performance in the domain of reconfigurable processor design. Specifically, we seek to identify regularly occurring, computation-heavy patterns in an application or set of applications. These patterns become candidates for hard-logic implementation, potentially embedded in the flexible reconfigurable fabric as special optimized instructions. In this work we present an extension to previous work in instruction generation: an algorithm that identifies parallel templates. We discuss the advantages of parallel templates, and prove the correctness of our algorithm. We introduce an All-Pairs Common Slack Graph (APCSG) as an effective tool for parallel template generation. Finally, we demonstrate the effectiveness of our algorithm on several applicationse dataflow graphs, reducing latency on average by 51.98%, without unreasonably increasing chip area.