Reducing delay and power consumption of the wakeup logic through instruction packing and tag memoization

  • Authors:
  • Joseph Sharkey;Dmitry Ponomarev;Kanad Ghose;Oguz Ergin

  • Affiliations:
  • Department of Computer Science, State University of New York, Binghamton, NY;Department of Computer Science, State University of New York, Binghamton, NY;Department of Computer Science, State University of New York, Binghamton, NY;Intel Barcelona Research Center, Intel Labs, UPC, Barcelona, Spain

  • Venue:
  • PACS'04 Proceedings of the 4th international conference on Power-Aware Computer Systems
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

Dynamic instruction scheduling logic is one of the most critical components of modern superscalar microprocessors, both from the delay and power dissipation standpoints. The delay and energy requirement of driving the result tags across the associatively-addressed issue queue accounts for a significant percentage of the scheduler's overhead and also limits the design scalability. We propose two schemes to reduce the power consumption and the delays of the wakeup logic. Our first scheme – instruction packing – shares the associative part of an issue queue entry between two instructions, each with at most one non-ready source. As a result, the number of entries in the issue queue (and, hence, the length of the tag buses) can be reduced by a factor of two with almost no impact on the IPCs, because most instructions either enter the pipeline with at least one of their source operands ready, or do not make use of two source registers to begin with. Our second scheme – tag memoization – avoids driving the upper portion of the tags, if those bits did not change their values from what was driven on the same tag bus during the most recent broadcast. While instruction packing results in the reduced length of the tag buses, tag memoization reduced the number of tag lines that need to be driven. We evaluate our designs using detailed microarchitectural simulations of the SPEC 2000 benchmarks and the SPICE simulations of the issue queue layouts.