Instruction packing: reducing power and delay of the dynamic scheduling logic

Authors:
Joseph J. Sharkey;Dmitry V. Ponomarev;Kanad Ghose;Oguz Ergin
Affiliations:
State University of New York, Binghamton, NY;State University of New York, Binghamton, NY;State University of New York, Binghamton, NY;Intel Barcelona Research Center, Barcelona, Spain
Venue:
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Year:
2005

Citing 34
Cited 10

Complexity-effective superscalar processors

Proceedings of the 24th annual international symposium on Computer architecture
Power considerations in the design of the Alpha 21264 microprocessor

DAC '98 Proceedings of the 35th annual Design Automation Conference
A low-complexity issue logic

Proceedings of the 14th international conference on Supercomputing
On pipelining dynamic instruction scheduling logic

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
A circuit level implementation of an adaptive issue queue for power-aware microprocessors

GLSVLSI '01 Proceedings of the 11th Great Lakes symposium on VLSI
Reducing the complexity of the issue logic

ICS '01 Proceedings of the 15th international conference on Supercomputing
Energy-effective issue logic

ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Efficient dynamic scheduling through tag elimination

ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
A large, fast instruction window for tolerating cache misses

ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
A scalable instruction queue design using dependence chains

ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Reducing power requirements of instruction scheduling through dynamic allocation of multiple datapath resources

Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Select-free instruction scheduling logic

Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Energy-efficient hybrid wakeup logic

Proceedings of the 2002 international symposium on Low power electronics and design
The PowerPC 604 RISC microprocessor

IEEE Micro
Hierarchical Scheduling Windows

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Half-price architecture

Proceedings of the 30th annual international symposium on Computer architecture
Energy efficient co-adaptive instruction fetch and issue

Proceedings of the 30th annual international symposium on Computer architecture
Cyclone: a broadcast-free dynamic instruction scheduler with selective replay

Proceedings of the 30th annual international symposium on Computer architecture
Data-Flow Prescheduling for Large Instruction Windows in Out-of-Order Processors

HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Inherently lower-power high-performance superscalar architectures

Inherently lower-power high-performance superscalar architectures
Macro-op Scheduling: Relaxing Scheduling Loop Constraints

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Energy-efficient issue queue design

IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on low power
Energy Efficient Comparators for Superscalar Datapaths

IEEE Transactions on Computers
Scaling the issue window with look-ahead latency prediction

Proceedings of the 18th annual international conference on Supercomputing
Wire Delay is Not a Problem for SMT (In the Near Future)

Proceedings of the 31st annual international symposium on Computer architecture
Continual flow pipelines

ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Defining Wakeup Width for Efficient Dynamic Scheduling

ICCD '04 Proceedings of the IEEE International Conference on Computer Design
Dataflow Mini-Graphs: Amplifying Superscalar Capacity and Bandwidth

Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Out-of-Order Commit Processors

HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture
Low-Complexity Distributed Issue Queue

HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture
Reducing the Scheduling Critical Cycle Using Wakeup Prediction

HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture
Exploring Wakeup-Free Instruction Scheduling

HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture
Non-uniform instruction scheduling

Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Instruction recirculation: eliminating counting logic in wakeup-free schedulers

Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing

Instruction packing: Toward fast and energy-efficient instruction scheduling

ACM Transactions on Architecture and Code Optimization (TACO)
Energy-efficient dynamic instruction scheduling logic through instruction grouping

Proceedings of the 2006 international symposium on Low power electronics and design
A framework for reducing instruction scheduling overhead in dynamic compilers

CASCON '06 Proceedings of the 2006 conference of the Center for Advanced Studies on Collaborative research
Exploiting Operand Availability for Efficient Simultaneous Multithreading

IEEE Transactions on Computers
Scalable Dynamic Instruction Scheduler through Wake-Up Spatial Locality

IEEE Transactions on Computers
Harnessing horizontal parallelism and vertical instruction packing of programs to improve system overall efficiency

Proceedings of the conference on Design, automation and test in Europe
Streamlining long latency instructions for seamlessly combined out-of-order and in-order execution

Microprocessors & Microsystems
Energy-efficient dynamic instruction scheduling logic through instruction grouping

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Non-uniform instruction scheduling

Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Instruction recirculation: eliminating counting logic in wakeup-free schedulers

Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing

Quantified Score

Hi-index	0.01

Visualization

Abstract

The instruction scheduling logic used in modern superscalar microprocessors often relies on associative searching of the issue queue entries to dynamically wakeup instructions for the execution. Traditional designs use one issue queue entry for each instruction, regardless of the actual number of operands actively used in the wakeup process. In this paper we propose Instruction Packing - a novel microarchitectural technique that reduces both the delay and the power consumption of the issue queue by sharing the associative part of an issue queue entry between two instructions, each with at most one non-ready register source operand at the time of dispatch. Our results show that Instruction Packing provides a 39% reduction of the whole issue queue power and 21.6% reduction in the wakeup delay with as little as 0.4% IPC degradation on the average across the simulated SPEC benchmarks