The Named-State Register File: Implementation and Performance

Authors:
P. R. Nuth;W. J. Dally
Affiliations:
-;-
Venue:
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Year:
1995

Citing 20
Cited 5

Global register allocation at link time

SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction
Toward a dataflow/von Neumann hybrid architecture

ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
MASA: a multithreaded processor architecture for parallel symbolic computing

ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Experience with CST: programming and implementation

PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Can dataflow subsume von Neumann computing?

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Exploring the benefits of multiple hardware contexts in a multiprocessor architecture: preliminary results

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Exploiting large register sets

Microprocessors & Microsystems - Special issue on applying and implementing RISC
Architectural support for reduced register saving/restoring in single-window register files

ACM Transactions on Computer Systems (TOCS)
Fine-grain parallelism with minimal hardware support: a compiler-controlled threaded abstract machine

ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Processor coupling: integrating compile time and runtime scheduling for parallelism

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Register relocation: flexible contexts for multithreading

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Multiple threads in cyclic register windows

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Register connection: a new approach to adding registers into instruction set architectures

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Impact of sharing-based thread placement on multithreaded architectures

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
The named-state register file

The named-state register file
Monsoon: an explicit token-store architecture

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Sparcle: An Evolutionary Processor Design for Large-Scale Multiprocessors

IEEE Micro
WAITING ALGORITHMS FOR SYNCHRONIZATION IN LARGE-SCALE MULTIPROCESSORS

WAITING ALGORITHMS FOR SYNCHRONIZATION IN LARGE-SCALE MULTIPROCESSORS
Architectural and implementation tradeoffs in the design of multiple-context processors

Architectural and implementation tradeoffs in the design of multiple-context processors
VLSI Processor Architecture

IEEE Transactions on Computers

Software-Directed Register Deallocation for Simultaneous Multithreaded Processors

IEEE Transactions on Parallel and Distributed Systems
How to Fake 1000 Registers

Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Energy-efficient mechanisms for managing thread context in throughput processors

Proceedings of the 38th annual international symposium on Computer architecture
A Hierarchical Thread Scheduler and Register File for Energy-Efficient Throughput Processors

ACM Transactions on Computer Systems (TOCS)
Compiler support for lightweight context switching

ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers

Quantified Score

Hi-index	0.00

Visualization

Abstract

Context switches are slow in conventional processors because the entire processor state must be saved and restored, even if much of the state is not used before the next context switch. This paper introduces the Named-Stare Register File, a fine-grain associative register tile. The NSF uses hardware and software techniques to efficiently manage registers among sequential or parallel procedure activations. The NSF holds more live data per register than conventional register files, and requires much less spill and reload traffic to switch between concurrent contexts. The NSF speeds execution of some sequential and parallel programs by 9% to 17% over alternative register tile organizations. The NSF has access time comparable to a conventional register file and only adds 5% to the area of a typical processor chip.