Register connection: a new approach to adding registers into instruction set architectures

Authors:
Tokuzo Kiyohara;Scott Mahlke;William Chen;Roger Bringmann;Richard Hank;Sadun Anik;Wen-Mei Hwu
Venue:
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Year:
1993

Abstract

Code optimization and scheduling for superscalar and superpipelined processors often increase the register requirements of programs. For existing instruction sets with a small to moderate number of registers, this increased register requirement can limit the effectiveness of the compiler. In this paper, we introduce a new architectural method for adding a set of extended registers to an architecture. Using a novel concept of connection, this method allows data stored in the extended registers to be accessed by instructions that apparently reference core registers. Furthermore, we address the technical issues involved in applying the new method to an architecture: instruction set extension, procedure call convention, context switching considerations, upward compatibility, efficient implementation, compiler support, and performance. Experimental results based on a prototype compiler and execution-driven simulation show that the proposed method can significantly improve the performance of superscalar processors with a small or moderate number of registers.
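The connection idea in the abstract can be illustrated with a toy model: a core register name is looked up in a mapping table, so an instruction that names a core register can reach an extended register. This is only a sketch under stated assumptions. The class `ConnectedRegisterFile`, the `connect` method, the register counts, and the single shared mapping for reads and writes are all hypothetical. The abstract does not specify the actual instruction semantics or hardware organization.

```python
class ConnectedRegisterFile:
    """Toy model (assumption): core register names are indirected through a mapping table."""

    def __init__(self, num_core=8, num_extended=24):
        # Register counts are illustrative, not taken from the paper.
        self.num_core = num_core
        # Physical storage: core registers first, then extended registers.
        self.physical = [0] * (num_core + num_extended)
        # Initially every core register name maps to its own physical slot.
        self.mapping = list(range(num_core))

    def connect(self, core_index, physical_index):
        """Hypothetical 'connect' operation: redirect a core name to a physical slot."""
        if not 0 <= core_index < self.num_core:
            raise IndexError("core register index out of range")
        if not 0 <= physical_index < len(self.physical):
            raise IndexError("physical register index out of range")
        self.mapping[core_index] = physical_index

    def read(self, core_index):
        # An instruction names a core register; the table selects the real slot.
        return self.physical[self.mapping[core_index]]

    def write(self, core_index, value):
        self.physical[self.mapping[core_index]] = value


rf = ConnectedRegisterFile()
rf.write(3, 10)                   # r3 initially maps to physical slot 3
rf.connect(3, 20)                 # r3 now names extended slot 20
rf.write(3, 99)                   # the value lands in extended slot 20
value_via_extended = rf.read(3)   # read through the connection
rf.connect(3, 3)                  # restore the default mapping
value_via_core = rf.read(3)       # the original core value is untouched
```

In this model the instruction encoding never changes: every access still names one of the eight core registers, and only the mapping table decides which physical register is touched.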