An energy-efficient method of supporting flexible special instructions in an embedded processor with compact ISA

Authors:
Dongrui She; Yifan He; Henk Corporaal
Affiliations:
Eindhoven University of Technology, Eindhoven, the Netherlands; Eindhoven University of Technology, Eindhoven, the Netherlands; Eindhoven University of Technology, Eindhoven, the Netherlands
Venue:
ACM Transactions on Architecture and Code Optimization (TACO)
Year:
2008

Abstract

In application-specific processor design, a common way to improve performance and efficiency is to add special instructions that execute complex operation patterns. In a generic embedded processor with a compact Instruction Set Architecture (ISA), however, such instructions can incur substantial overhead: (i) more bits are needed to encode the extra opcodes and operands, resulting in wider instructions; and (ii) more Register File (RF) ports are required to supply the extra operands to the functional units. This overhead may increase energy consumption considerably. In this article, we propose to support flexible operation-pair patterns in a processor with a compact 24-bit RISC-like ISA using (i) a partially reconfigurable decoder that exploits pattern locality to reduce the opcode space requirement, and (ii) a software-controlled bypass network that reduces the operand encoding bits and RF ports required. An energy-aware compiler backend designed for the proposed architecture performs pattern selection and bypass-aware scheduling to generate energy-efficient code. Although the proposed design imposes extra constraints on the operation patterns, experimental results show that, for benchmark applications from different domains, the average dynamic instruction count is reduced by over 25%, which is only about 2% less than that of the architecture without such constraints. The proposed architecture reduces total energy by an average of 15.8% compared to the RISC baseline, whereas the architecture without constraints achieves almost no improvement because of its high overhead. When high performance is required, introducing multicycle SFU operations allows the proposed architecture to achieve a speedup of 13.8% with a 13.1% energy reduction compared to the baseline.
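
To make the idea of operation-pair pattern selection under a limited opcode budget more concrete, the following is a minimal Python sketch. It is not the authors' algorithm: the three-address operation format, the `slots` parameter (standing in for the number of pair patterns the reconfigurable decoder can hold at once), the single-assignment assumption, and the greedy frequency-based selection are all assumptions made for illustration. A producer-consumer pair is treated as fusable when the intermediate value is read exactly once in the block, which is the case where the value could plausibly travel over a bypass path instead of through the RF.

```python
from collections import Counter

# Hypothetical three-address operations: (dest, opcode, sources).
# Assumes each name is assigned once per block and ignores liveness
# beyond the block; both are simplifications for illustration.

def find_pairs(block):
    """Return (producer_idx, consumer_idx, pattern) for candidate pairs."""
    uses = Counter(src for _, _, srcs in block for src in srcs)
    last_def = {}
    pairs = []
    for i, (dest, opcode, srcs) in enumerate(block):
        for src in srcs:
            j = last_def.get(src)
            # Candidate: the producer's result is read exactly once in the block.
            if j is not None and uses[src] == 1:
                pairs.append((j, i, (block[j][1], opcode)))
        last_def[dest] = i
    return pairs

def select_patterns(block, slots):
    """Greedily keep the `slots` most frequent pair patterns and fuse them.

    Returns the chosen patterns and the resulting instruction count.
    """
    pairs = find_pairs(block)
    counts = Counter(pattern for _, _, pattern in pairs)
    chosen = {pattern for pattern, _ in counts.most_common(slots)}
    fused = set()      # indices of operations already part of a fused pair
    n_fused = 0
    for j, i, pattern in pairs:
        if pattern in chosen and j not in fused and i not in fused:
            fused.update((j, i))
            n_fused += 1
    return chosen, len(block) - n_fused

# Hypothetical basic block with two multiply-add chains.
block = [
    ("t1", "mul", ("a", "b")),
    ("t2", "add", ("t1", "c")),
    ("t3", "mul", ("d", "e")),
    ("t4", "add", ("t3", "t2")),
    ("t5", "shl", ("t4", "k")),
    ("t6", "and", ("t5", "m")),
]

if __name__ == "__main__":
    for slots in (1, 4):
        print(slots, select_patterns(block, slots))
```

In this toy block, a single slot already captures both multiply-add chains, which loosely mirrors the abstract's point that exploiting pattern locality keeps the opcode space requirement small. A real compiler backend would also have to weigh energy, encoding constraints, and scheduling interactions, none of which this sketch models.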