Tolerating data access latency with register preloading

  • Authors:
  • William Y. Chen; Scott A. Mahlke; Wen-mei W. Hwu; Tokuzo Kiyohara; Pohua P. Chang

  • Venue:
  • ICS '92 Proceedings of the 6th international conference on Supercomputing
  • Year:
  • 1992

Abstract

By exploiting fine-grained parallelism, superscalar processors can potentially increase the performance of future supercomputers. However, supercomputers typically have a long access delay to their first-level memory, which can severely restrict the performance of superscalar processors. Compilers attempt to move load instructions far enough ahead of their uses to hide this latency, but conventional load movement is limited by data dependence analysis: a load cannot safely be moved above a store unless the analysis shows that the two reference different locations. This paper introduces a simple hardware scheme, referred to as preload register update, that allows the compiler to move load instructions even when data dependence analysis is inconclusive. Preload register update keeps the load destination registers coherent when load instructions are moved past store instructions that reference the same location. With this addition, superscalar processors can more effectively tolerate longer data access latencies.
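
The coherence idea described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design from the paper: the class name `PreloadMachine`, the methods `preload`, `store`, and `use`, the per-register address table, and the rule that the first use of a register ends its tracking are all hypothetical choices made for illustration. The sketch shows a load hoisted above a store whose address the compiler could not disambiguate; when the store turns out to reference the same location, the preloaded register is updated with the stored value.

```python
class PreloadMachine:
    """Toy model (assumed, not the paper's design) of memory plus registers
    whose preloaded values are kept coherent with later stores."""

    def __init__(self):
        self.memory = {}     # address -> value
        self.regs = {}       # register name -> value
        self.preloaded = {}  # register name -> address it was preloaded from

    def preload(self, reg, addr):
        # A load hoisted above possibly aliasing stores: read memory now
        # and remember the source address for later coherence checks.
        self.regs[reg] = self.memory.get(addr, 0)
        self.preloaded[reg] = addr

    def store(self, addr, value):
        self.memory[addr] = value
        # Coherence step: any register preloaded from this address
        # receives the newly stored value.
        for reg, src in self.preloaded.items():
            if src == addr:
                self.regs[reg] = value

    def use(self, reg):
        # Assumption: the first use of the register ends its tracking.
        self.preloaded.pop(reg, None)
        return self.regs[reg]


def run_example(alias):
    """Hoist a load of address 100 above a store that may or may not alias it."""
    m = PreloadMachine()
    m.memory[100] = 1
    m.memory[200] = 2
    m.preload("r1", 100)                # load moved above the store
    m.store(100 if alias else 200, 42)  # store to an ambiguous address
    return m.use("r1")
```

In this model, `run_example(True)` yields the stored value because the store hit the preloaded address, while `run_example(False)` yields the value originally in memory. In both cases the result matches what the program would have computed had the load stayed below the store.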