The SPARC architecture manual (version 9)
The SPARC architecture manual (version 9)
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Complexity-effective superscalar processors
Proceedings of the 24th annual international symposium on Computer architecture
The multicluster architecture: reducing cycle time through partitioning
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
The energy complexity of register files
ISLPED '98 Proceedings of the 1998 international symposium on Low power electronics and design
Multiple-banked register file architectures
Proceedings of the 27th annual international symposium on Computer architecture
Inherently Lower-Power High-Performance Superscalar Architectures
IEEE Transactions on Computers
A low-leakage dynamic multi-ported register file in 0.13mm CMOS
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Reducing the complexity of the register file in dynamic superscalar processors
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
The Alpha 21264 Microprocessor
IEEE Micro
Reducing register ports for higher speed and lower energy
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
A three dimensional register file for superscalar processors
HICSS '95 Proceedings of the 28th Hawaii International Conference on System Sciences
Very Long Instruction Word architectures and the ELI-512
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Energy-Efficient Register Access
SBCCI '00 Proceedings of the 13th symposium on Integrated circuits and systems design
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
A Scalable Register File Architecture for Dynamically Scheduled Processors
PACT '96 Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques
The microarchitecture of a low power register file
Proceedings of the 2003 international symposium on Low power electronics and design
Isolating Short-Lived Operands for Energy Reduction
IEEE Transactions on Computers
Speculative software management of datapath-width for energy optimization
Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
A Content Aware Integer Register File Organization
Proceedings of the 31st annual international symposium on Computer architecture
Area and System Clock Effects on SMT/CMP Throughput
IEEE Transactions on Computers
Balanced Multithreading: Increasing Throughput via a Low Cost Multithreading Hierarchy
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Evaluation of Bus Based Interconnect Mechanisms in Clustered VLIW Architectures
Proceedings of the conference on Design, Automation and Test in Europe - Volume 2
Scalable cache memory design for large-scale SMT architectures
WMPI '04 Proceedings of the 3rd workshop on Memory performance issues: in conjunction with the 31st international symposium on computer architecture
An asymmetric clustered processor based on value content
Proceedings of the 19th annual international conference on Supercomputing
Compiler Directed Early Register Release
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Proceedings of the 33rd annual international symposium on Computer Architecture
SEED: scalable, efficient enforcement of dependences
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Selective writeback: exploiting transient values for energy-efficiency and performance
Proceedings of the 2006 international symposium on Low power electronics and design
Register file caching for energy efficiency
Proceedings of the 2006 international symposium on Low power electronics and design
Register port complexity reduction in wide-issue processors with selective instruction execution
Microprocessors & Microsystems
Using fine grain multithreading for energy efficient computing
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Exploiting virtual registers to reduce pressure on real registers
ACM Transactions on Architecture and Code Optimization (TACO)
IEEE Transactions on Computers
Asymmetrically banked value-aware register files for low-energy and high-performance
Microprocessors & Microsystems
Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems
Proceedings of the 45th annual Design Automation Conference
Power-efficient clustering via incomplete bypassing
Proceedings of the 13th international symposium on Low power electronics and design
Selective writeback: reducing register file pressure and energy consumption
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Thermal-aware post compilation for VLIW architectures
Proceedings of the 2009 Asia and South Pacific Design Automation Conference
Evaluation of bus based interconnect mechanisms in clustered VLIW architectures
International Journal of Parallel Programming
Exploring the limits of early register release: Exploiting compiler analysis
ACM Transactions on Architecture and Code Optimization (TACO)
Energy-efficient register caching with compiler assistance
ACM Transactions on Architecture and Code Optimization (TACO)
Virtual registers: reducing register pressure without enlarging the register file
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
A configurable multi-ported register file architecture for soft processor cores
ARC'07 Proceedings of the 3rd international conference on Reconfigurable computing: architectures, tools and applications
Decoupled state-execute architecture
ISHPC'05/ALPS'06 Proceedings of the 6th international symposium on high-performance computing and 1st international conference on Advanced low power systems
Exploiting narrow-width values for thermal-aware register file designs
Proceedings of the Conference on Design, Automation and Test in Europe
Performance evaluation of superscalar processor with multi-bank register file using SPEC2000
ICCOMP'06 Proceedings of the 10th WSEAS international conference on Computers
CRAM: coded registers for amplified multiporting
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
A register allocation framework for banked register files with access constraints
ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
An optimized front-end physical register file with banking and writeback filtering
PACS'04 Proceedings of the 4th international conference on Power-Aware Computer Systems
Bit-sliced datapath for energy-efficient high performance microprocessors
PACS'04 Proceedings of the 4th international conference on Power-Aware Computer Systems
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Mat-core: a decoupled matrix core extension for general-purpose processors
Neural, Parallel & Scientific Computations
Modular multi-ported SRAM-based memories
Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays
Hi-index | 0.01 |
Multiported register files are a critical component of high-performance superscalar microprocessors. Conventional multiported structures can consume significant power and die area. We examine the designs of banked multiported register files that employ multiple interleaved banks of fewer ported register cells to reduce power and area. Banked register files designs have been shown to provide sufficient bandwidth for a superscalar machine, but previous designs had complex control structures that would likely limit cycle time and add to design complexity. We develop a banked register file with much simpler and faster control logic while only slightly increasing the number of ports per bank. We present area, delay, and energy numbers extracted from layouts of the banked register file. For a four-issue superscalar processor, we show that we can reduce area by a factor of three, access time by 20%, and energy by 40%, while decreasing IPC by less than 5%.