Machine organization of the IBM RISC System/6000 processor
IBM Journal of Research and Development
The superblock: an effective technique for VLIW and superscalar compilation
The Journal of Supercomputing - Special issue on instruction-level parallelism
Communications of the ACM - Special issue on computer architecture
Register allocation & spilling via graph coloring
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
Software-Directed Register Deallocation for Simultaneous Multithreaded Processors
IEEE Transactions on Parallel and Distributed Systems
Evaluating the Use of Register Queues in Software Pipelined Loops
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Optimization for the Intel® Itanium® architecture register stack
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
The Named-State Register File: Implementation and Performance
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Increasing the number of effective registers in a low-power processor using a windowed register file
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Hardware-managed register allocation for embedded processors
Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Differential register allocation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Partitioning Variables across Register Windows to Reduce Spill Code in a Low-Power Processor
IEEE Transactions on Computers
Efficient Use of Invisible Registers in Thumb Code
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
VICTORIA: VMX indirect compute technology oriented towards in-line acceleration
Proceedings of the 3rd conference on Computing frontiers
Minimizing bank selection instructions for partitioned memory architecture
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Allocating architected registers through differential encoding
ACM Transactions on Programming Languages and Systems (TOPLAS)
Register pointer architecture for efficient embedded processors
Proceedings of the conference on Design, automation and test in Europe
Minimal placement of bank selection instructions for partitioned memory architectures
ACM Transactions on Embedded Computing Systems (TECS)
Optimal placement of bank selection instructions in polynomial time
Proceedings of the 16th International Workshop on Software and Compilers for Embedded Systems
Hi-index | 0.02 |
Code optimization and scheduling for superscalar and superpipelined processors often increase the register requirement of programs. For existing instruction sets with a small to moderate number of registers, this increased register requirement can be a factor that limits the effectivess of the compiler. In this paper, we introduce a new architectural method for adding a set of extended registers into an architecture. Using a novel concept of connection, this method allows the data stored in the extended registers to be accessed by instructions that apparently reference core registers. Furthermore, we address the technical issues involved in applying the new method to an architecture: instruction set extension, procedure call convention, context switching considerations, upward compatibility, efficient implementation, compiler support, and performance. Experimental results based on a prototype compiler and execution driven simulation show that the proposed method can significantly improve the performance of superscalar processors with a small or moderate number of registers.