Circuits for wide-window superscalar processors

  • Authors:
  • Dana S. Henry;Bradley C. Kuszmaul;Gabriel H. Loh;Rahul Sami

  • Affiliations:
  • Yale University, Departments of Computer Science and Electrical Engineering;Yale University, Departments of Computer Science and Electrical Engineering;Yale University, Departments of Computer Science and Electrical Engineering;Yale University, Departments of Computer Science and Electrical Engineering

  • Venue:
  • Proceedings of the 27th annual international symposium on Computer architecture
  • Year:
  • 2000

Quantified Score

Hi-index 0.01

Visualization

Abstract

Our program benchmarks and simulations of novel circuits indicate that large-window processors are feasible. Using our redesigned superscalar components, a large-window processor implemented in today's technology can achieve an increase of 10-60% (geometric mean of 31%) in program speed compared to today's processors. The processor operates at clock speeds comparable to today's processors, but achieves significantly higher ILP.To measure the impact of a large window on clock speed, we design and simulate new implementations of the logic components that most limit the critical path of our large-window processor: the schedule logic and the wake-up logic. We use log-depth cyclic segmented prefix (CSP) circuits to reimplement these components. Our layouts and simulations of critical paths through these circuits indicate that our large-window processor could be clocked at frequencies exceeding 500MHz in today's technology. Our commit logic and rename logic can also run at these speeds.To measure the impact of a large window on ILP, we compare two microarchitectures, the first has a 128-instruction window, an 8-wide fetch unit, and 20-wide issue (four integer, branch, multiply, float, and memory units), whereas the second has a 32-instruction window, and a 4-wide fetch unit and is comparable to today's processors. For each, we simulate different window reuse and bypass policies. Our simulations show that the large-window processor achieves significantly higher IPC. This performance increase comes despite the fact that the large-window processor uses a wrap-around window while the small-window processor uses a compressing window, thus effectively increasing its number of outstanding instructions. Furthermore, the large-window processor sometimes pays an extra clock cycle for bypassing.