Register pointer architecture for efficient embedded processors

Authors:
JongSoo Park;Sung-Boem Park;James D. Balfour;David Black-Schaffer;Christos Kozyrakis;William J. Dally
Affiliations:
Stanford University;Stanford University;Stanford University;Stanford University;Stanford University;Stanford University
Venue:
Proceedings of the conference on Design, automation and test in Europe
Year:
2007

Abstract

Conventional register file architectures cannot optimally exploit temporal locality in data references due to their limited capacity and static encoding of register addresses in instructions. In conventional embedded architectures, the register file capacity cannot be increased without resorting to longer instruction words. Similarly, loop unrolling is often required to exploit locality in register file accesses across iterations because naming registers statically is inflexible. Both remedies, longer instruction words and loop unrolling, lead to significant code size increases, which are undesirable in embedded systems. In this paper, we introduce the Register Pointer Architecture (RPA), which allows registers to be accessed indirectly through register pointers. Indirection allows a larger register file to be used without increasing the length of instruction words. Additional register file capacity allows many loads and stores, such as those introduced by spill code, to be eliminated, which improves performance and reduces energy consumption. Moreover, indirection affords additional flexibility in naming registers, which reduces the need to apply loop unrolling in order to maximize reuse of register-allocated variables.
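
The indirection described in the abstract can be illustrated with a toy software model. This is only a sketch under stated assumptions: the class name `RegisterPointerFile`, the sizes (64 registers, 4 pointers), and the post-access `step` update are hypothetical choices made for illustration and are not taken from the paper. The point it tries to show is that a single rolled loop body can name a different register on every iteration, because the instruction refers to a pointer rather than to a fixed register number.

```python
# Toy model of register access through register pointers.
# All names, sizes, and the pointer-update rule are illustrative
# assumptions, not the design described in the paper.

class RegisterPointerFile:
    def __init__(self, num_regs=64, num_ptrs=4):
        self.regs = [0] * num_regs   # large register file, reached indirectly
        self.ptrs = [0] * num_ptrs   # small set of pointers named by instructions

    def set_ptr(self, p, index):
        """Point register pointer p at register `index`."""
        self.ptrs[p] = index % len(self.regs)

    def read(self, p, step=0):
        """Read the register named by pointer p, then advance p by `step`."""
        idx = self.ptrs[p]
        value = self.regs[idx]
        self.ptrs[p] = (idx + step) % len(self.regs)
        return value

    def write(self, p, value, step=0):
        """Write the register named by pointer p, then advance p by `step`."""
        idx = self.ptrs[p]
        self.regs[idx] = value
        self.ptrs[p] = (idx + step) % len(self.regs)


def dot_product(rf, n):
    """Rolled loop: the same body is reused for every iteration,
    with operands named by pointers 0 and 1 instead of fixed registers."""
    rf.set_ptr(0, 0)   # pointer 0 walks the first vector
    rf.set_ptr(1, n)   # pointer 1 walks the second vector
    acc = 0
    for _ in range(n):
        acc += rf.read(0, step=1) * rf.read(1, step=1)
    return acc


rf = RegisterPointerFile()
rf.regs[0:4] = [1, 2, 3, 4]   # first vector held in registers 0..3
rf.regs[4:8] = [5, 6, 7, 8]   # second vector held in registers 4..7
result = dot_product(rf, 4)
print(result)  # 1*5 + 2*6 + 3*7 + 4*8 = 70
```

In this model an instruction only needs enough operand bits to select one of the 4 pointers, whereas naming one of the 64 registers directly would need a wider field. With static register names, the same loop would have to be unrolled so that each iteration could refer to its own registers.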