Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Reducing the complexity of the register file in dynamic superscalar processors
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Banked multiported register files for high-frequency superscalar microprocessors
Proceedings of the 30th annual international symposium on Computer architecture
VSV: L2-Miss-Driven Variable Supply-Voltage Scaling for Low Power
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Increasing Processor Performance Through Early Register Release
ICCD '04 Proceedings of the IEEE International Conference on Computer Design
Evaluation of Speed and Area of Clustered VLIW Processors
VLSID '05 Proceedings of the 18th International Conference on VLSI Design held jointly with 4th International Conference on Embedded Systems Design
Efficient design space exploration of high performance embedded out-of-order processors
Proceedings of the conference on Design, automation and test in Europe: Proceedings
An L2-miss-driven early register deallocation for SMT processors
Proceedings of the 21st annual international conference on Supercomputing
Proceedings of the 7th ACM international conference on Computing frontiers
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Run-time reconfiguration of expandable cache for embedded systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |
With CMOS scaling leading to ever increasing levels of transistor integration on a chip, designers of high-performance embedded processors have ample area available to increase processor resources in order to improve performance. However, increasing resource sizes can increase power dissipation and also reduce access time, which can limit maximum achievable operating frequency. In this paper, we explore optimizations for the processor register file (RF), to improve performance and reduce the energy-delay product. We show that while increasing the size of the RF can potentially increase the IPC, overall it results in an increase in program execution time. In response we propose L2MRFS -- a dynamic register file resizing scheme in tandem with frequency scaling, which exploits L2 cache misses to noticeably improve processor performance (11% on average) and also significantly reduce the energy-delay product (7%).