Low-power, low-complexity instruction issue using compiler assistance

Authors:
Madhavi G. Valluri; Lizy K. John; Kathryn S. McKinley
Affiliations:
The University of Texas at Austin, TX; The University of Texas at Austin, TX; The University of Texas at Austin, TX
Venue:
Proceedings of the 19th annual international conference on Supercomputing
Year:
2005

Abstract

In an out-of-order issue processor, instructions are dynamically reordered and issued to function units in their data-ready order rather than their original program order to achieve high performance. The logic that facilitates dynamic issue is one of the most power-hungry and time-critical components in a typical out-of-order issue processor.

This paper develops a cooperative hardware/software technique to reduce the complexity and energy consumption of the issue logic. The proposed scheme is based on the observation that not all instructions in a program require the same amount of dynamic reordering. Instructions that belong to basic blocks for which the compiler can perform near-optimal scheduling do not need any intra-block instruction reordering but require only inter-block instruction overlap. In contrast, blocks where the compiler is limited by artificial dependences and memory misses require both intra-block and inter-block instruction reordering. The proposed Reorder-Sensitive Issue Scheme utilizes a novel compile-time analyzer to evaluate the quality of schedules generated by the static scheduler and to estimate the dynamic reordering requirement of instructions within each basic block. At the microarchitecture level, we propose a novel issue queue that exploits the varying dynamic scheduling requirements of basic blocks to lower the power dissipation and complexity of the dynamic issue hardware.

An evaluation of the technique on several SPEC integer benchmarks indicates that we can reduce the energy consumption in the issue queue on average by 72% with only 5% performance degradation. Additionally, the proposed issue hardware is significantly less complex when compared to a conventional monolithic out-of-order issue queue, providing the potential for high clock speeds.
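
The abstract does not describe how the compile-time analyzer works, so the following is only a rough illustration of the underlying observation, not the paper's method. It assumes a hypothetical heuristic: a basic block is treated as needing intra-block reordering if a load in it is flagged as likely to miss, or if issuing its static schedule strictly in program order takes longer than a simple lower bound (the true-dependence critical path, or the issue-width limit). All names (`Instr`, `may_miss`, `critical_path`, `in_order_length`, `needs_reordering`), the latencies, and the issue width are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Instr:
    dest: Optional[str]        # register written, or None
    srcs: Tuple[str, ...]      # registers read
    latency: int = 1           # assumed execution latency in cycles (>= 1)
    may_miss: bool = False     # hypothetical flag: load likely to miss in cache


def critical_path(block: List[Instr]) -> int:
    """Cycles needed by the longest true-dependence (RAW) chain in the block."""
    ready = {}                 # register -> cycle its latest value is available
    longest = 0
    for ins in block:
        start = max((ready.get(r, 0) for r in ins.srcs), default=0)
        finish = start + ins.latency
        if ins.dest is not None:
            ready[ins.dest] = finish
        longest = max(longest, finish)
    return longest


def in_order_length(block: List[Instr], width: int) -> int:
    """Cycles to finish the block when issued strictly in program order."""
    ready = {}
    cycle = 0                  # cycle in which the next instruction may issue
    issued = 0                 # instructions already issued in `cycle`
    finish_all = 0
    for ins in block:
        operands = max((ready.get(r, 0) for r in ins.srcs), default=0)
        if issued == width:    # issue slots exhausted: move to the next cycle
            cycle, issued = cycle + 1, 0
        if operands > cycle:   # stall until the source operands are ready
            cycle, issued = operands, 0
        issued += 1
        finish = cycle + ins.latency
        if ins.dest is not None:
            ready[ins.dest] = finish
        finish_all = max(finish_all, finish)
    return finish_all


def needs_reordering(block: List[Instr], width: int = 4, slack: int = 0) -> bool:
    """Hypothetical classifier: True if the block likely benefits from
    intra-block dynamic reordering."""
    if any(ins.may_miss for ins in block):
        return True
    resource_bound = -(-len(block) // width)   # ceil(len / width)
    lower_bound = max(critical_path(block), resource_bound)
    return in_order_length(block, width) > lower_bound + slack


# Two independent loads followed by their consumer: in-order issue is already tight.
well_scheduled = [
    Instr("r1", (), latency=2),
    Instr("r2", (), latency=2),
    Instr("r3", ("r1", "r2")),
]

# Each consumer directly follows its load, so in-order issue serializes the chains.
poorly_scheduled = [
    Instr("r1", (), latency=3),
    Instr("r2", ("r1", "r1")),
    Instr("r3", (), latency=3),
    Instr("r4", ("r3", "r3")),
]
```

With an issue width of 2, `needs_reordering(well_scheduled, width=2)` is false and `needs_reordering(poorly_scheduled, width=2)` is true. In a scheme like the one the abstract outlines, blocks of the first kind could be steered to simpler in-order issue hardware, while blocks of the second kind would keep full out-of-order issue.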