Simulations of scientific programs running on traditional scientific computer architectures show that execution with hundreds of registers can be more than twice as fast as execution with only eight registers. In addition, execution with a small number of fast registers and hundreds of slower registers can be as fast as execution with hundreds of fast registers. A hierarchical organization of fast and slow registers is presented, register-allocation strategies are discussed, and a novel, indirect, register-addressing mechanism is described.
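
The claim that a few fast registers backed by many slower ones can approach the speed of an all-fast register file can be explored with a toy model. The sketch below is not the organization or allocation strategy described in the paper. It assumes a hypothetical two-level register file in which a hit in the fast level costs one cycle, a miss costs three cycles, and a missed register is promoted into the fast level with least-recently-used eviction. The function name `access_cost`, the latencies, the promotion policy, and the synthetic trace are all illustrative assumptions.

```python
from collections import OrderedDict


def access_cost(trace, fast_size, fast_latency=1, slow_latency=3):
    """Return total cycles to service a register-access trace.

    Assumed model (not from the source): a hit in the fast level costs
    fast_latency; a miss costs slow_latency and promotes the register
    into the fast level, evicting the least recently used entry.
    """
    fast = OrderedDict()  # register number -> None, oldest entry first
    cycles = 0
    for reg in trace:
        if reg in fast:
            fast.move_to_end(reg)  # mark as most recently used
            cycles += fast_latency
        else:
            cycles += slow_latency
            fast[reg] = None
            if len(fast) > fast_size:
                fast.popitem(last=False)  # evict least recently used
    return cycles


# Synthetic trace: four heavily reused registers plus 200 registers touched once.
trace = [r for i in range(200) for r in (0, 1, 2, 3, 100 + i)]

all_fast = len(trace)  # every access costs one cycle
hierarchical = access_cost(trace, fast_size=8)
all_slow = access_cost(trace, fast_size=0)

print(all_fast, hierarchical, all_slow)
```

In this model the gap between the hierarchical and all-fast totals depends entirely on how much of the trace reuses a small working set, which is why the register-allocation strategy matters as much as the hardware organization.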