An investigation of the performance of various dynamic scheduling techniques
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Complexity-effective superscalar processors
Proceedings of the 24th annual international symposium on Computer architecture
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Dynamic IPC/clock rate optimization
Proceedings of the 25th annual international symposium on Computer architecture
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
Load latency tolerance in dynamically scheduled processors
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 14th international conference on Supercomputing
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
Circuits for wide-window superscalar processors
Proceedings of the 27th annual international symposium on Computer architecture
A circuit level implementation of an adaptive issue queue for power-aware microprocessors
GLSVLSI '01 Proceedings of the 11th Great Lakes symposium on VLSI
Reducing the complexity of the issue logic
ICS '01 Proceedings of the 15th international conference on Supercomputing
Focusing processor policies via critical-path prediction
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
The optimal logic depth per pipeline stage is 6 to 8 FO4 inverter delays
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Efficient dynamic scheduling through tag elimination
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
A large, fast instruction window for tolerating cache misses
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
A scalable instruction queue design using dependence chains
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Select-free instruction scheduling logic
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
A high-speed dynamic instruction scheduling scheme for superscalar processors
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Energy-efficient hybrid wakeup logic
Proceedings of the 2002 international symposium on Low power electronics and design
The PowerPC 604 RISC microprocessor
IEEE Micro
The MIPS R10000 Superscalar Microprocessor
IEEE Micro
IEEE Micro
The Alpha 21264 Microprocessor
IEEE Micro
Integrating Adaptive On-Chip Storage Structures for Reduced Dynamic Power
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Hierarchical Scheduling Windows
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 30th annual international symposium on Computer architecture
Cyclone: a broadcast-free dynamic instruction scheduler with selective replay
Proceedings of the 30th annual international symposium on Computer architecture
Data-Flow Prescheduling for Large Instruction Windows in Out-of-Order Processors
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Energy-efficient issue queue design
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on low power
Defining Wakeup Width for Efficient Dynamic Scheduling
ICCD '04 Proceedings of the IEEE International Conference on Computer Design
An efficient wakeup design for energy reduction in high-performance superscalar processors
Proceedings of the 2nd conference on Computing frontiers
Exploring Wakeup-Free Instruction Scheduling
HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture
Instruction packing: reducing power and delay of the dynamic scheduling logic
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Hi-index | 14.98 |
In a high-performance superscalar processor, the instruction scheduler often comes with poor scalability and high complexity due to the expensive wakeup operation. From detailed simulation-based analyses, we find that 95% of the wakeup distances between two dependent instructions are short, in the range of 16 instructions, and 99% are in the range of 31 instructions. We apply this wakeup spatial locality to the design of conventional CAM-based and matrix-based wakeup logic respectively. By limiting the wakeup coverage to i + 16 instructions where 0 =