The filter cache: an energy efficient memory structure
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
Pipeline gating: speculation control for energy reduction
Proceedings of the 25th annual international symposium on Computer architecture
Dynamic vectorization: a mechanism for exploiting far-flung ILP in ordinary programs
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
A circuit level implementation of an adaptive issue queue for power-aware microprocessors
GLSVLSI '01 Proceedings of the 11th Great Lakes symposium on VLSI
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Micro-operation cache: a power aware frontend for the variable instruction length ISA
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Instruction flow-based front-end throttling for power-aware high-performance processors
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Saving energy with just in time instruction delivery
Proceedings of the 2002 international symposium on Low power electronics and design
Energy efficient co-adaptive instruction fetch and issue
Proceedings of the 30th annual international symposium on Computer architecture
Reducing instruction fetch energy with backwards branch control information and buffering
Proceedings of the 2003 international symposium on Low power electronics and design
Execution cache-based microarchitecture power-efficient superscalar processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Low power microarchitecture with instruction reuse
Proceedings of the 5th conference on Computing frontiers
Reusing cached schedules in an out-of-order processor with in-order issue logic
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
Hi-index | 0.00 |
As power dissipation inexorably becomes the major bottleneck in system integration and reliability, the front-end instruction delivery path in a traditional out-of-order superscalar processor needs to deliver high application performance in an energy-effective manner. This challenge can be addressed by efficiently reusing the work of fetch and decode performed during preceding loop iterations and resident mostly within the processor itself. As a large percentage of the instructions currently under fetch have previously dispatched copies resident in the Reorder Buffer (ROB), in this paper we develop a mechanism to utilize the ROB as a storage location for previously decoded instructions. Thus instructions can be fed directly from the ROB into the rename and issue stages, enabling the gating off of the fetch and decode logic for large periods of time so as to deliver significant power savings. Power and performance criticality of the ROB requires an efficient reuse identification mechanism; we outline such a cost-efficient Reuse Identification Unit (RIU) which enables effective identification of the matches between the ROB entries and the instructions currently under fetch. Simulation results on both multimedia and SPEC 2000 benchmarks confirm that incorporating the proposed technique on traditional out-of-order superscalar processors results in not only a sight improvement in performance, but also significant savings in the overall system power dissipation, achieved within a limited hardware budget.