Execution cache-based microarchitecture for power-efficient superscalar processors

Authors:
Emil Talpes; Diana Marculescu
Affiliations:
Carnegie Mellon University, Pittsburgh, PA; Carnegie Mellon University, Pittsburgh, PA
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year:
2005

Abstract

This paper investigates a possible solution to the problem of power consumption in superscalar, out-of-order processors by proposing a new microarchitecture designed specifically to curb the increasing power requirements of high-end processors. More precisely, we show that modifying the well-established superscalar processor architecture yields significant power savings. Our approach aims to limit the growing amount of power that a typical processor spends on dynamic optimizations, including out-of-order scheduling and register renaming. It does so by reusing as much as possible of the work done by the front end of a typical superscalar, out-of-order pipeline, via a special cache nested deep within the processor structure. By reusing instructions that are already decoded, reordered, and register-renamed, the front end of the pipeline can be turned off for long periods, significantly reducing overall power consumption. Experimental results show savings of up to 35% (30% on average) in average energy per committed instruction and up to 35% (20% on average) in energy-delay product, with about 9% average performance loss, over a large spectrum of SPEC95 and SPEC2000 benchmarks.
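
The reuse idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' design: the class name ToyPipeline, the FIFO eviction policy, the capacity, and the three-instruction loop body are all hypothetical, chosen only to show how replaying stored front-end work lets the front end stay idle on repeated code. It counts front-end activations and does not model energy or timing.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPipeline:
    """Toy model (hypothetical, not the paper's design): counts front-end activations."""
    capacity: int = 4                           # number of cached traces (assumed value)
    cache: dict = field(default_factory=dict)   # start PC -> already-processed trace
    front_end_runs: int = 0                     # times fetch/decode/rename/schedule ran
    cache_hits: int = 0                         # times a stored trace was replayed

    def _front_end(self, raw_instructions):
        # Stand-in for fetch, decode, register renaming, and scheduling.
        self.front_end_runs += 1
        return tuple(f"issued:{ins}" for ins in raw_instructions)

    def execute(self, start_pc, raw_instructions):
        trace = self.cache.get(start_pc)
        if trace is not None:
            # Hit: replay the stored work; the front end is not activated.
            self.cache_hits += 1
            return trace
        # Miss: do the front-end work once and store the result for reuse.
        trace = self._front_end(raw_instructions)
        if len(self.cache) >= self.capacity:
            # Evict the oldest entry (FIFO, chosen only for simplicity).
            self.cache.pop(next(iter(self.cache)))
        self.cache[start_pc] = trace
        return trace


def run_loop(iterations):
    """Execute the same (made-up) loop body repeatedly and report activity counts."""
    pipe = ToyPipeline()
    body = ["add r1,r1,r2", "lw r3,0(r1)", "bne r3,r0,loop"]
    for _ in range(iterations):
        pipe.execute(0x400, body)
    return pipe.front_end_runs, pipe.cache_hits
```

In this toy setting, a loop executed ten times activates the front end once and replays the stored trace nine times, which is the kind of repeated-code behavior the abstract relies on for its savings.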