A high-speed dynamic instruction scheduling scheme for superscalar processors

Authors:
Masahiro Goshima;Kengo Nishino;Toshiaki Kitamura;Yasuhiko Nakashima;Shinji Tomita;Shin-ichiro Mori
Affiliations:
Kyoto University, Yoshida Hon-machi, Sakyo-ku, Kyoto, Japan;Kyoto University, Yoshida Hon-machi, Sakyo-ku, Kyoto, Japan;Kyoto University, Yoshida Hon-machi, Sakyo-ku, Kyoto, Japan;Kyoto University, Yoshida Hon-machi, Sakyo-ku, Kyoto, Japan;Kyoto University, Yoshida Hon-machi, Sakyo-ku, Kyoto, Japan;Kyoto University, Yoshida Hon-machi, Sakyo-ku, Kyoto, Japan
Venue:
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Year:
2001

Abstract

The wakeup logic is the part of the issue window responsible for managing the ready flags of operands in dynamic instruction scheduling. Conventional wakeup logic is associative and is composed of a RAM and a CAM. Because this logic cannot be pipelined and the delays of these memories are dominated by wire delay, it will become more critical as pipelines deepen and feature sizes shrink. This paper describes a new scheduling scheme based not on association but on matrices that represent the dependences between instructions. Because the matrix update logic detects dependences between instructions in the same way the register renaming logic does, the wakeup operation reduces to simply reading the matrices. This paper also describes a technique that reduces the effective size of the matrices at a small IPC penalty. We designed layouts of the logic under a 0.18 µm CMOS design rule provided by Fujitsu Limited and calculated the delays. We also evaluated the IPC penalties by cycle-level simulation. The results show that our scheme achieves a 2.7 GHz clock speed with an IPC degradation of about 1%.
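The matrix idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the circuit or the exact organization the paper proposes: the class name `DependenceMatrix`, its methods, the `producer_of` table, and the single-cycle latency (a producer's column is cleared as soon as it issues) are all hypothetical choices made for illustration. Each window entry holds one row of bits, where bit `j` of row `i` means that entry `i` still waits on the result of entry `j`. Dependences are detected at dispatch, much as renaming would detect them, so wakeup needs no tag comparison: an entry is ready when its row reads as all zeros.

```python
class DependenceMatrix:
    """Toy model of matrix-based wakeup (illustrative, not the paper's design)."""

    def __init__(self, window_size):
        self.window_size = window_size
        # rows[i] is a bitmask: bit j set means entry i waits on entry j.
        self.rows = [0] * window_size
        self.valid = [False] * window_size
        # Hypothetical rename-like table: register name -> slot of the
        # in-flight instruction that will produce it.
        self.producer_of = {}

    def dispatch(self, slot, srcs, dest):
        """Place an instruction in `slot` and record its dependences."""
        row = 0
        for reg in srcs:
            producer = self.producer_of.get(reg)
            if producer is not None:
                row |= 1 << producer
        self.rows[slot] = row
        self.valid[slot] = True
        if dest is not None:
            self.producer_of[dest] = slot

    def ready(self):
        """Wakeup is a plain read: a valid entry with an all-zero row is ready."""
        return [i for i in range(self.window_size)
                if self.valid[i] and self.rows[i] == 0]

    def issue(self, slot):
        """Issue `slot` and clear its column (assumes single-cycle latency)."""
        mask = ~(1 << slot)
        for i in range(self.window_size):
            self.rows[i] &= mask
        self.valid[slot] = False
        for reg, producer in list(self.producer_of.items()):
            if producer == slot:
                del self.producer_of[reg]


if __name__ == "__main__":
    m = DependenceMatrix(4)
    m.dispatch(0, ["r1", "r2"], "r3")  # no in-flight producers
    m.dispatch(1, ["r3"], "r4")        # waits on slot 0
    m.dispatch(2, ["r3", "r4"], "r5")  # waits on slots 0 and 1
    print(m.ready())
    m.issue(0)
    print(m.ready())
```

In this model the cost of detecting dependences is paid once at dispatch, and the per-cycle wakeup step is a read of each row, which is the property the abstract attributes to the matrix scheme. The technique for reducing the effective matrix size is not modeled here.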