Non-Consistent Dual Register Files to Reduce Register Pressure

Authors:
J. Llosa; M. Valero; E. Ayguade
Venue:
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Year:
1995

Abstract

The continuous growth in instruction-level parallelism offered by microprocessors requires a large register file and a large number of ports to access it. This paper presents the non-consistent dual register file, an alternative implementation and management of the register file. Non-consistent dual register files support the bandwidth demands and the high register requirements without penalizing either access time or implementation cost. The proposal is evaluated for software pipelined loops and compared against a unified register file. Empirical results show improvements in performance and a noticeable reduction in the density of memory traffic, owing to a reduction in spill code. Spill code can in general increase the minimum initiation interval and decrease loop performance. Additional improvements can be obtained when the operations are scheduled with the register file organization proposed in this paper in mind.
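
The abstract's remark that spill code can raise the minimum initiation interval can be illustrated with the standard resource-constrained lower bound used in modulo scheduling: for each resource class, divide the number of operations per iteration by the number of units and round up, then take the maximum. The sketch below is only an illustration under stated assumptions. The machine (two memory ports, two floating-point units), the loop's operation counts, and the number of spill loads and stores are hypothetical and are not taken from the paper. The recurrence-constrained bound is ignored.

```python
from math import ceil


def res_mii(op_counts, unit_counts):
    """Resource-constrained lower bound on the initiation interval.

    op_counts:   operations per loop iteration, keyed by resource class
    unit_counts: functional units available, keyed by the same classes
    """
    return max(ceil(op_counts[r] / unit_counts[r]) for r in op_counts)


def with_spill(op_counts, spill_loads, spill_stores):
    """Return a copy of op_counts with spill traffic added to the memory class."""
    out = dict(op_counts)
    out["mem"] = out.get("mem", 0) + spill_loads + spill_stores
    return out


# Hypothetical machine and loop body (assumed values, not from the paper).
units = {"mem": 2, "fpu": 2}
loop = {"mem": 4, "fpu": 4}

base = res_mii(loop, units)                       # no spill code
spilled = res_mii(with_spill(loop, 2, 1), units)  # 2 spill loads, 1 spill store

print(base, spilled)
```

In this made-up case the three extra memory operations make the memory ports the bottleneck, so the bound rises from 2 to 4 cycles per iteration. This is the kind of effect that reducing spill code is meant to avoid.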