Increasing data-bandwidth to instruction-set extensions through register clustering

Authors:
Kingshuk Karuri;Anupam Chattopadhyay;Manuel Hohenauer;Rainer Leupers;Gerd Ascheid;Heinrich Meyr
Affiliations:
RWTH Aachen University, Aachen, Germany;RWTH Aachen University, Aachen, Germany;RWTH Aachen University, Aachen, Germany;RWTH Aachen University, Aachen, Germany;RWTH Aachen University, Aachen, Germany;RWTH Aachen University, Aachen, Germany
Venue:
Proceedings of the 2007 IEEE/ACM international conference on Computer-aided design
Year:
2007

Abstract

The conflicting requirements of performance and flexibility in today's embedded system market are forcing system designers to make increasing use of so-called configurable or customizable processor cores. Such processors meet demanding performance constraints by accommodating application-specific Instruction-Set Extensions (ISEs), which have naturally become a vital component of current processor customization flows. One major bottleneck in maximizing ISE performance is the limited data-bandwidth between the General Purpose Register (GPR) file and the ISEs. For improved performance, it is desirable to have a large data-bandwidth from the GPRs to the ISEs. However, the tight area constraints of modern embedded processors often restrict the GPR I/O of ISEs in order to save register-file port area. This paper presents a novel approach to increase the GPR I/O of ISEs without significantly increasing the size of the GPR file. This is achieved by applying the concept of register clustering, common in many VLIW architectures, to single-issue processors with high-performance ISEs. Such clustering often causes extra register moves in compiled code, so this work also presents an algorithm to minimize such register moves. The benchmark results presented in this paper show that our solution can significantly reduce the area overhead of many-port GPR files without sacrificing the performance improvements obtained through ISEs.
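
To make the trade-off described in the abstract concrete, the following is a minimal illustrative sketch in Python. It is not the algorithm from the paper, which the abstract does not specify. It assumes a toy model in which the GPR file is split into `num_clusters` clusters, each offering `ports_per_cluster` read ports to an ISE, and in which every value read by an ISE beyond a cluster's port limit costs one register move. The function names, the cost model, and the exhaustive search are all hypothetical, and cluster capacity and register liveness are ignored.

```python
from itertools import product


def moves_needed(assignment, ise_reads, ports_per_cluster):
    """Count register moves under the toy model.

    assignment: dict mapping value name -> cluster index (hypothetical model).
    ise_reads: list of lists; each inner list holds the values one ISE reads.
    Every value beyond ports_per_cluster that an ISE reads from a single
    cluster is assumed to cost exactly one move into another cluster.
    """
    total = 0
    for reads in ise_reads:
        per_cluster = {}
        for value in reads:
            cluster = assignment[value]
            per_cluster[cluster] = per_cluster.get(cluster, 0) + 1
        total += sum(max(0, n - ports_per_cluster) for n in per_cluster.values())
    return total


def best_assignment(values, ise_reads, num_clusters, ports_per_cluster):
    """Exhaustively search cluster assignments and return (cost, assignment).

    Brute force is only practical for a handful of values; it stands in for
    whatever heuristic or exact method a real compiler would use.
    """
    best = None
    for combo in product(range(num_clusters), repeat=len(values)):
        assignment = dict(zip(values, combo))
        cost = moves_needed(assignment, ise_reads, ports_per_cluster)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


# Hypothetical workload: one ISE reads four values, another reads two of them.
values = ["a", "b", "c", "d"]
ise_reads = [["a", "b", "c", "d"], ["a", "b"]]

# All values in one 2-port cluster: the 4-input ISE needs 2 extra moves.
naive = {v: 0 for v in values}
naive_cost = moves_needed(naive, ise_reads, ports_per_cluster=2)

# Two clusters with 2 read ports each can feed the 4-input ISE directly.
best_cost, best = best_assignment(values, ise_reads, num_clusters=2, ports_per_cluster=2)
```

In this toy setting, two clusters with two read ports each can supply a four-input ISE without any moves, whereas a single two-port file cannot. That mirrors the abstract's claim that clustering raises GPR I/O without a many-port register file, provided values are assigned to clusters with care.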