SEED: scalable, efficient enforcement of dependences

Authors:
Francisco J. Mesa-Martínez;Michael C. Huang;Jose Renau
Affiliations:
University of California Santa Cruz;University of Rochester;University of California Santa Cruz
Venue:
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Year:
2006

Abstract

Instruction issue logic is a critical component in modern high-performance out-of-order processors. The ever-increasing latencies found in modern processors, mostly associated with memory accesses and longer pipelines, can be attenuated using large issue queues. Conventional designs rely on atomic wakeup-select cycles to ensure compact scheduling. These designs must aggressively utilize broadcasting, compaction, and heavily ported structures that scale poorly in terms of both power consumption and access time.

To provide high scheduling flexibility and large instruction capacity without incurring prohibitive latency and energy overhead, we propose a novel scheme that uses an out-of-order, broadcast-free instruction wakeup block feeding an in-order scheduler. Multi-banked, index-based structures are used throughout this scheme to provide a high degree of scalability while achieving efficient dependence tracking, resulting in good overall performance and energy efficiency. We call this design "Scalable, Efficient Enforcement of Dependences (SEED)". We present a detailed design and analysis of SEED through an extensive evaluation. Compared to a conventional issue queue design, which is favorably assumed to scale in size without any impact on cycle time, the performance degradation of our design is 3% for both the INT and FP suites of SPEC CPU2000. For such a small performance cost, SEED enjoys a 19% reduction in total chip power consumption for a 32-entry configuration. We also synthesize SEED and a conventional issue logic with 90 nm standard cell logic. Synthesis results show that SEED can cycle at twice the speed of a conventional issue logic of equivalent size. Cycling at the same frequency, SEED consumes ten times less dynamic power and five times less static power while achieving substantial area savings.
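
The general idea of broadcast-free, index-based wakeup feeding an in-order scheduler can be illustrated with a small software model. The sketch below is only a toy under stated assumptions, not the SEED microarchitecture: the class `ToyWakeup`, its methods, and the register names are hypothetical, banking and timing are ignored, and the abstract does not specify any of these details. It shows a producer waking only the consumers recorded under its own register index, rather than broadcasting a tag to every queue entry, and woken instructions entering a FIFO that issues strictly in arrival order.

```python
from collections import deque


class ToyWakeup:
    """Hypothetical toy model of index-based wakeup; not the paper's design."""

    def __init__(self):
        self.pending = {}          # instruction id -> count of outstanding source operands
        self.consumers = {}        # producer register -> ids of instructions waiting on it
        self.ready_fifo = deque()  # in-order scheduler: instructions issue in wakeup order

    def dispatch(self, instr_id, srcs, ready_regs):
        """Record an instruction under each source register that is not yet ready."""
        waiting = [r for r in srcs if r not in ready_regs]
        if not waiting:
            self.ready_fifo.append(instr_id)
            return
        self.pending[instr_id] = len(waiting)
        for reg in waiting:
            self.consumers.setdefault(reg, []).append(instr_id)

    def complete(self, dest_reg):
        """Wake only the consumers indexed under dest_reg (no broadcast)."""
        for instr_id in self.consumers.pop(dest_reg, []):
            self.pending[instr_id] -= 1
            if self.pending[instr_id] == 0:
                del self.pending[instr_id]
                self.ready_fifo.append(instr_id)

    def issue(self):
        """Issue the oldest woken instruction, or None if nothing is ready."""
        return self.ready_fifo.popleft() if self.ready_fifo else None


if __name__ == "__main__":
    w = ToyWakeup()
    ready = {"r1"}
    w.dispatch(0, ["r1"], ready)        # all sources ready: goes straight to the FIFO
    w.dispatch(1, ["r2", "r3"], ready)  # waits on two producers
    w.dispatch(2, ["r3"], ready)        # waits on one producer
    w.complete("r3")                    # wakes instruction 2 only
    w.complete("r2")                    # wakes instruction 1
    print([w.issue() for _ in range(4)])
```

In this model the work done on a completion depends on the number of recorded consumers of that register, not on the total queue size, which is the intuition behind avoiding broadcast; how the actual design bounds and banks these structures is described in the paper itself.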