Instruction fetch deferral using static slack

Authors:
Gregory A. Muthler;David Crowe;Sanjay J. Patel;Steven S. Lumetta
Affiliations:
University of Illinois at Urbana-Champaign;University of Illinois at Urbana-Champaign;University of Illinois at Urbana-Champaign;University of Illinois at Urbana-Champaign
Venue:
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Year:
2002

Abstract

In this paper we present an approach to boosting performance and tolerating latency by deferring non-critical instructions into a deferred queue for later processing. Instruction deferral thus allows more critical instructions to be fetched, dispatched, and possibly executed earlier.

We present methods for identifying deferrable instructions using previously investigated notions of instruction slack. In particular, we use static slack to determine whether an instruction is deferrable. The static slack of an instruction corresponds to the number of cycles the instruction can be delayed without impacting overall execution time when considering all dynamic paths from that instruction. A significant fraction of the dynamic instruction stream has enough static slack to be deferred by 10 or more cycles on an aggressive execution model. Furthermore, the small amount of register-based communication from deferred instructions to non-deferred instructions makes a deferral-based approach to fetch and execution very attractive.

We use a trace-cache-based microarchitecture to overcome some significant implementation challenges associated with instruction deferral. Overall, instruction deferral boosts the performance of a 4-wide processor by approximately 11% and an 8-wide processor by 6% on eight of the SPEC2000 integer benchmarks.
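
To make the definition of static slack concrete, the following is a minimal sketch and not the authors' method. It assumes a simplified model in which only data dependences and fixed latencies constrain execution, and it takes the static slack of an instruction to be the minimum slack over all of its dynamic instances. The function names, the trace format, and the toy trace are hypothetical; the 10-cycle threshold is borrowed from the figure quoted in the abstract.

```python
def dynamic_slack(trace):
    """Per-instance slack via a critical-path pass over a dependence trace.

    trace: list of (pc, latency, preds) in program order, where preds holds
    indices of earlier entries. Hypothetical format; dataflow-only model.
    """
    n = len(trace)
    # Earliest start: an instance starts once all of its producers finish.
    est = [0] * n
    for i, (_, _, preds) in enumerate(trace):
        est[i] = max((est[p] + trace[p][1] for p in preds), default=0)
    total = max((est[i] + trace[i][1] for i in range(n)), default=0)
    # Latest start that keeps total time unchanged, tightened by consumers.
    lst = [total - trace[i][1] for i in range(n)]
    for i in range(n - 1, -1, -1):
        for p in trace[i][2]:
            lst[p] = min(lst[p], lst[i] - trace[p][1])
    return [(trace[i][0], lst[i] - est[i]) for i in range(n)]


def static_slack(instance_slacks):
    """Static slack per pc: the minimum over all of its dynamic instances."""
    result = {}
    for pc, slack in instance_slacks:
        result[pc] = min(slack, result.get(pc, slack))
    return result


def deferrable(static, threshold=10):
    """Instructions whose static slack meets the (assumed) threshold."""
    return {pc for pc, s in static.items() if s >= threshold}


# Toy trace: B is a 12-cycle operation on the critical path.
trace = [
    ("A", 1, []), ("B", 12, [0]), ("C", 1, [0]), ("D", 1, [1, 2]),
    ("C", 1, [3]), ("E", 1, [0]), ("E", 1, [2]),
]
slack = static_slack(dynamic_slack(trace))
print(slack, deferrable(slack))
```

In this toy trace, instruction C has 11 cycles of slack in one dynamic instance but none in the other, so its static slack is 0 and it is not deferred. Only E, which has slack on every instance, qualifies. This is the sense in which static slack considers all dynamic paths from an instruction.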