The continuous growth in instruction-level parallelism offered by microprocessors requires a large register file and a large number of ports to access it. This paper presents the non-consistent dual register file, an alternative implementation and management of the register file. Non-consistent dual register files support the bandwidth demands and the high register requirements without penalizing either access time or implementation cost. The proposal is evaluated for software pipelined loops and compared against a unified register file. Empirical results show improved performance and a noticeable reduction in the density of memory traffic, owing to a reduction in spill code. Spill code can, in general, increase the minimum initiation interval and decrease loop performance. Additional improvements can be obtained when operations are scheduled with the register file organization proposed in this paper in mind.
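
The link between spill code and the minimum initiation interval can be illustrated with the usual resource-constrained lower bound used in modulo scheduling. The sketch below is not taken from the paper: the machine configuration, the operation counts, and the helper name `res_mii` are all hypothetical. It also ignores recurrence constraints, which impose a second lower bound.

```python
from math import ceil

def res_mii(op_counts, unit_counts):
    """Resource-constrained lower bound on the initiation interval.

    For each resource class, the loop body needs at least
    ceil(operations / units) cycles per iteration; the bound is the
    largest such value over all classes.
    """
    return max(ceil(op_counts[k] / unit_counts[k]) for k in op_counts)

# Hypothetical machine: 2 memory ports, 4 ALUs.
units = {"mem": 2, "alu": 4}

# Hypothetical loop body: 4 memory operations, 8 ALU operations.
loop = {"mem": 4, "alu": 8}

# Same loop after the allocator inserts 3 spill loads/stores (assumed).
spilled = {"mem": 4 + 3, "alu": 8}

print(res_mii(loop, units))     # 2
print(res_mii(spilled, units))  # 4
```

In this made-up case the three extra memory operations make the memory ports the bottleneck and double the bound, which is the kind of effect the abstract attributes to spill code.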