Single FU bypass networks for high clock rate superscalar processors

  • Authors:
  • Aneesh Aggarwal

  • Affiliations:
  • Department of Electrical and Computer Engineering, Binghamton University, Binghamton, NY

  • Venue:
  • HiPC'04 Proceedings of the 11th international conference on High Performance Computing
  • Year:
  • 2004

Abstract

Microprocessors depend heavily on broadcast-based bypass networks to eliminate pipeline hazards arising from data dependencies. However, even though bypassing is logically simple, increasing clock speeds make broadcasting slower and more difficult to implement, especially for wide-issue and deeply pipelined processors. The problem is exacerbated by shrinking feature sizes, as wire delays become more important than gate delays. In this paper, we propose Single FU bypass networks for high clock rate superscalar processors where, instead of a fully connected broadcast-based bypass network, results from a functional unit (FU) are forwarded only to that same FU. The new bypass network design is based on two observations: a result produced by an instruction is mostly required by just one other instruction, and the operands of many instructions come from a single other instruction. The new bypass network significantly reduces the data forwarding latency, while incurring only a small impact (about 2% for most of the SPEC2K benchmarks) on the instructions per cycle (IPC) count. However, the reduced bypass latency has a high potential for increased clock speeds. Single FU bypass networks are also much more scalable than broadcast-based bypass networks for wider and more deeply pipelined future microprocessors.
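The scalability contrast described in the abstract can be illustrated with a toy count of forwarding paths. This is only a sketch under stated assumptions, not the paper's model: it assumes each FU has a fixed number of operand inputs (two by default), counts one path per producer-to-operand-input pair, and ignores pipeline depth and wire length. The function name `bypass_paths` and its parameters are hypothetical.

```python
def bypass_paths(num_fus: int, operands_per_fu: int = 2, single_fu: bool = False) -> int:
    """Count producer-to-operand-input forwarding paths in a toy model.

    Assumption (not from the paper): every FU has `operands_per_fu`
    operand inputs, and each forwarding path connects one FU output
    to one operand input.
    """
    # Broadcast: every FU's result reaches the inputs of all FUs.
    # Single FU: every FU's result reaches only its own inputs.
    consumers_per_producer = 1 if single_fu else num_fus
    return num_fus * consumers_per_producer * operands_per_fu


if __name__ == "__main__":
    for n in (2, 4, 8):
        full = bypass_paths(n)
        single = bypass_paths(n, single_fu=True)
        print(f"{n} FUs: broadcast={full} paths, single-FU={single} paths")
```

In this simplified count the broadcast network grows quadratically with the number of FUs, while the single FU network grows linearly, which is consistent with the abstract's claim of better scalability for wider machines.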