Matrix scheduler reloaded

  • Authors: Peter G. Sassone; Jeff Rupley, II; Edward Brekelbaum; Gabriel H. Loh; Bryan Black
  • Affiliations (in author order): Intel Microarchitecture Research Lab (MRL), Austin, TX; Intel Microarchitecture Research Lab (MRL), Austin, TX; Intel Microarchitecture Research Lab (MRL), Austin, TX; Georgia Institute of Technology, Atlanta, GA; Intel Microarchitecture Research Lab (MRL), Austin, TX
  • Venue: Proceedings of the 34th Annual International Symposium on Computer Architecture
  • Year: 2007

Abstract

From multiprocessor scale-up to cache sizes to the number of reorder-buffer entries, microarchitects wish to reap the benefits of more computing resources while staying within power and latency bounds. This tension is quite evident in schedulers, which need to be large and single-cycle for maximum performance on out-of-order cores. In this work we present two straightforward modifications to a matrix scheduler implementation which greatly strengthen its scalability. Both are based on the simple observation that the wakeup and picker matrices are sparse, even at small sizes; thus small indirection tables can be used to greatly reduce their width and latency. This technique can be used to create quicker iso-performance schedulers (17-58% reduced critical path) or larger iso-timing schedulers (7-26% IPC increase). Importantly, the power and area requirements of the additional hardware are likely offset by the greatly reduced matrix sizes and subsuming the functionality of the power-hungry allocation CAMs.
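
The sparsity observation can be illustrated with a toy software model. The abstract does not describe how the indirection tables are organized, so the sketch below is an assumption and not the authors' design: the class `CompactWakeupMatrix`, its methods, and the policy of giving a column only to producers that have at least one waiting consumer are all hypothetical. It shows the general idea of mapping scheduler entries onto a small pool of matrix columns, so that each row is `num_columns` bits wide instead of `num_entries` bits wide.

```python
class CompactWakeupMatrix:
    """Toy model of a wakeup matrix narrowed by an indirection table.

    Hypothetical sketch: rows are scheduler entries, columns are a small
    pool of slots handed out only to producers that some entry waits on.
    """

    def __init__(self, num_entries, num_columns):
        self.num_entries = num_entries
        self.num_columns = num_columns
        self.rows = [0] * num_entries        # per-entry bitmask over columns
        self.valid = [False] * num_entries   # entry holds an unissued instruction
        self.column_of = {}                  # indirection table: producer entry -> column
        self.free_columns = list(range(num_columns))

    def allocate(self, entry, producers):
        """Place an instruction in `entry`; return False (stall) if columns run out."""
        # Only producers still waiting in the scheduler create a dependency.
        pending = [p for p in sorted(set(producers)) if self.valid[p]]
        new = [p for p in pending if p not in self.column_of]
        if len(new) > len(self.free_columns):
            return False                     # no side effects on a stall
        for p in new:
            self.column_of[p] = self.free_columns.pop()
        mask = 0
        for p in pending:
            mask |= 1 << self.column_of[p]
        self.rows[entry] = mask
        self.valid[entry] = True
        return True

    def ready(self):
        """Entries whose outstanding-dependency mask is empty."""
        return [e for e in range(self.num_entries)
                if self.valid[e] and self.rows[e] == 0]

    def issue(self, entry):
        """Issue a ready entry, wake its consumers, and recycle its column."""
        assert self.valid[entry] and self.rows[entry] == 0
        self.valid[entry] = False
        col = self.column_of.pop(entry, None)
        if col is not None:
            clear = ~(1 << col)
            for e in range(self.num_entries):
                self.rows[e] &= clear        # models the column-wide wakeup
            self.free_columns.append(col)

    def storage_bits(self):
        """(compact, full) matrix storage in bits, ignoring the indirection table."""
        return (self.num_entries * self.num_columns,
                self.num_entries * self.num_entries)
```

With 8 entries and 2 columns, this model stores 16 matrix bits instead of 64, at the cost of stalling allocation when more than two distinct producers have waiting consumers. The model ignores timing entirely, so it says nothing about the critical-path or IPC figures reported above.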