Fetch Gating Control through Speculative Instruction Window Weighting

  • Authors:
  • Hans Vandierendonck;André Seznec

  • Affiliations:
  • Department of Electronics and Information Systems/HiPEAC, Ghent University, Gent, Belgium B-9000;IRISA/INRIA/HiPEAC Campus de Beaulieu, Rennes Cedex, France 35042

  • Venue:
  • Transactions on High-Performance Embedded Architectures and Compilers II
  • Year:
  • 2009

Abstract

In a dynamic reordering superscalar processor, the front-end fetches instructions and places them in the issue queue. Instructions are then issued by the back-end execution core. Until recently, the front-end was designed to maximize performance without considering energy consumption: it fetches instructions as fast as it can until it is stalled by a full issue queue or some other blocking structure. This approach wastes energy for two reasons: (i) speculative execution causes many wrong-path instructions to be fetched and executed, and (ii) the back-end execution rate is usually below its peak rate, yet front-end structures are dimensioned to sustain peak performance. Dynamically reducing the front-end instruction rate and the active size of front-end structures (e.g. the issue queue) is therefore a necessary performance-energy trade-off. Techniques proposed in the literature attack only one of these effects. In previous work, we proposed Speculative Instruction Window Weighting (SIWW) [21], a fetch gating technique that addresses both fetch gating and dynamic sizing of the instruction issue queue. SIWW computes a global weight over the set of inflight instructions. This weight depends on the number and types of inflight instructions (non-branches, high-confidence or low-confidence branches, ...). The front-end instruction rate can be continuously adapted based on this weight. This paper extends the analysis of SIWW performed in previous work. It shows that SIWW performs better than previously proposed fetch gating techniques and that SIWW makes it possible to dynamically adapt the size of the active instruction queue.
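The weighting idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the class name `SIWWSketch`, the per-type weight values in `WEIGHTS`, the single fixed `threshold`, and the simple rule "gate fetch while the weight is at or above the threshold". The abstract only states that a global weight is computed from the number and types of inflight instructions and that the fetch rate is adapted based on it.

```python
from collections import deque

# Hypothetical per-type weight contributions. The abstract names the
# instruction categories but gives no values; these are illustrative only.
WEIGHTS = {
    "non_branch": 1,
    "high_conf_branch": 2,
    "low_conf_branch": 8,
}


class SIWWSketch:
    """Toy model: keep a running weight over inflight instructions and
    gate fetch while that weight is at or above a threshold (assumed rule)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.weight = 0            # global weight of all inflight instructions
        self.inflight = deque()    # instruction kinds, oldest first

    def fetch_allowed(self):
        # Fetch is gated once the accumulated weight reaches the threshold.
        return self.weight < self.threshold

    def fetch(self, kind):
        # Try to fetch one instruction; return False if fetch is gated.
        if not self.fetch_allowed():
            return False
        self.inflight.append(kind)
        self.weight += WEIGHTS[kind]
        return True

    def retire(self):
        # Remove the oldest inflight instruction and subtract its contribution.
        kind = self.inflight.popleft()
        self.weight -= WEIGHTS[kind]
        return kind


siww = SIWWSketch(threshold=12)
trace = ["non_branch", "low_conf_branch", "non_branch",
         "high_conf_branch", "non_branch"]
accepted = [siww.fetch(kind) for kind in trace]
print(accepted, siww.weight)
```

In this toy model a low-confidence branch contributes far more weight than a non-branch, so fetch is gated sooner when many uncertain branches are inflight, and because every inflight instruction contributes something, the weight also bounds how many instructions occupy the window. The real mechanism adapts the fetch rate continuously; the hard on/off gate above is a simplification.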