Realizing high IPC through a scalable memory-latency tolerant multipath microarchitecture

  • Authors: D. Morano; A. Khalafi; D. R. Kaeli; A. K. Uht
  • Affiliations: Northeastern University; Northeastern University; Northeastern University; University of Rhode Island
  • Venue: ACM SIGARCH Computer Architecture News
  • Year: 2003

Abstract

A microarchitecture is described that achieves high performance on conventional single-threaded program codes without compiler assistance. To obtain high instructions per clock (IPC) for inherently sequential programs (e.g., SpecInt-2000), a large number of instructions must be in flight simultaneously. However, several problems are associated with such microarchitectures, including scalability, issues related to control flow, and memory latency.

Our design investigates how to utilize a large mesh of processing elements in order to execute a single-threaded program. We present a basic overview of our microarchitecture and discuss how it addresses scalability as we attempt to execute many instructions in parallel. The microarchitecture makes use of control and value speculative execution, multipath execution, and a high degree of out-of-order execution to help extract instruction-level parallelism. Execution-time predication and time-tags for operands are used to maintain program order. We provide simulation results for several geometries of our microarchitecture, illustrating a range of design tradeoffs. We also present results showing that performance is only slightly affected across a range of memory system latencies.
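
The abstract names time-tags on operands as the means of keeping program order but does not give the rule. As an illustration only, the sketch below models one plausible reading: each in-flight instruction carries a tag equal to its position in program order, and a source operand accepts a broadcast result only from an earlier instruction that is at least as recent as the producer of the value it already holds. The class `OperandSlot`, its fields, and the `snoop` method are hypothetical names introduced here, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OperandSlot:
    """One source operand of an in-flight instruction (hypothetical model)."""
    consumer_tag: int             # program-order position of the consuming instruction
    register: str                 # architectural register this operand names
    value: Optional[int] = None   # value currently held, if any
    producer_tag: int = -1        # tag of the producer whose value is held (-1: none yet)

    def snoop(self, register: str, value: int, tag: int) -> bool:
        """Observe a broadcast result; return True if the consumer must re-execute.

        Assumed rule: accept only results for the same register whose producer
        is earlier than the consumer and no older than the current producer.
        """
        if register != self.register:
            return False
        if not (self.producer_tag <= tag < self.consumer_tag):
            return False
        changed = value != self.value
        self.value = value
        self.producer_tag = tag
        return changed


# Example: instruction 5 reads r1 while results arrive out of order.
slot = OperandSlot(consumer_tag=5, register="r1")
first = slot.snoop("r1", 10, tag=1)    # accepted: earlier producer
newer = slot.snoop("r1", 20, tag=3)    # accepted: closer producer in program order
stale = slot.snoop("r1", 30, tag=2)    # ignored: older than the current producer
future = slot.snoop("r1", 40, tag=7)   # ignored: producer follows the consumer
```

Under this assumed rule, results may arrive in any order and the operand still settles on the value from the nearest preceding producer, which is the property that out-of-order and speculative execution need in order to respect sequential semantics.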