Increased Scalability and Power Efficiency by Using Multiple Speed Pipelines

  • Authors:
  • Emil Talpes;Diana Marculescu

  • Affiliations:
  • Carnegie Mellon University;Carnegie Mellon University

  • Venue:
  • Proceedings of the 32nd annual international symposium on Computer Architecture
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

One of the most important problems faced by microarchitecture designers is the poor scalability of some of the current solutions with increased clock frequencies and wider pipelines. As several studies show, internal processor structures scale differently with decreasing device sizes. While in some cases the access latency is determined by the speed of the logic circuitry, for others it is dominated by the interconnect delay. Furthermore, while some stages can be super-pipelined with relatively small performance loss, others must be kept atomic. This paper proposes a possible solution to this problem, avoiding the traditional trade-off between parallelism and clock speed. First, allowing instructions to enter and leave the Issue Window in an asynchronously manner enables faster speeds in the front-end at the expense of small synchronization latencies. Second, using an Execution Cache for storing instructions that are already scheduled allows for bypassing the issue circuitry and thus clocking the execution core at higher frequencies. Combined, these two mechanisms result in a 50% to 60% performance increase for our test microarchitecture, without requiring a completely new scheduling mechanism. Furthermore, the proposed microarchitecture requires significantly less energy, with 30% reduction in a 0.13um or 20% in a 0.06um process technology over the original baseline.