Proceedings of the 24th annual international symposium on Computer architecture
Complexity-effective superscalar processors
Proceedings of the 24th annual international symposium on Computer architecture
The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
Pipeline gating: speculation control for energy reduction
Proceedings of the 25th annual international symposium on Computer architecture
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Power reduction through work reuse
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Computer Architecture: A Quantitative Approach
Computer Architecture: A Quantitative Approach
Scheduling Reusable Instructions for Power Reduction
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Execution cache-based microarchitecture power-efficient superscalar processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Power-efficient instruction delivery through trace reuse
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Reusing cached schedules in an out-of-order processor with in-order issue logic
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
Hi-index | 0.00 |
Power consumption has become a very important metric and challenging research topic in the design of microprocessors in the recent years. The goal of this work is to improve power efficiency of superscalar processors through instruction reuse at the execution stage. This paper proposes a new method for reusing instructions when they compose small loops: the loop's instructions are first buffered in the Reorder Buffer and reused afterwards without the need for dynamically unrolling the loop, as commonly implemented by the traditional instruction reusing techniques. The proposed method is implemented with the introduction of two new auxiliary hardware structures in a typical superscalar microarchitecture: a Finite State Machine (FSM), used to detect the reusable loops; and a Log used to store the renaming data for each instruction when the loop is "unrolled". In order to evaluate the proposed method we modified the sim-outorder tool from Simplescalar v3.0 for the PISA, and Wattch v1.02 Power Performance simulators. Several different configurations and benchmarks have been used during the simulations. The obtained results show that by implementing this new method in a superscalar microarchitecture, the power efficiency can be improved without significantly affecting neither the performance nor the cost.