A traditional extensible processor with customized circuits achieves high performance at the cost of flexibility, while a dynamically extensible processor with reconfigurable fabric offers flexibility for instruction-set extensions (ISEs) but suffers from computational inefficiency. We introduce a novel architecture, the Just-in-Time Customizable (JiTC) processor, that reconciles the conflicting demands of performance and flexibility in extensible processors. Our key innovation is a multi-stage accelerator, called the Specialized Functional Unit (SFU), that is tightly integrated into the processor pipeline. The SFU design is derived through a systematic study of a wide range of representative embedded applications. The SFU can be reconfigured on a per-cycle basis to support different application-specific instructions at the near-ideal performance of an extensible processor. We also provide an automated compilation tool chain for the JiTC processor. The experimental results confirm the efficiency and applicability of our approach.