Microarchitecture support for improving the performance of load target prediction

  • Authors:
  • Chung-Ho Chen;Akida Wu

  • Affiliations:
  • Dept. of Electronic Engineering, National Yunlin University of Science & Technology, Yunlin, Taiwan, R.O.C.;Dept. of Electronic Engineering, National Yunlin University of Science & Technology, Yunlin, Taiwan, R.O.C.

  • Venue:
  • MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
  • Year:
  • 1997

Abstract

This paper presents a load target prediction scheme that mitigates the impact of load latency in modern microprocessors. The scheme uses a cache-like buffer to supply the base address, offset, and operand size at the instruction fetch stage of the pipeline, so that a load target address can be computed early, at the decode stage. By using the load stride dynamically, the scheme achieves a prediction rate 15% higher than that of a previously proposed approach. With a 128-entry direct-mapped load-prediction buffer, two adders, and two forwarding paths, the scheme gives a 4-fetch processor an average speedup of 10% to 32% as the data cache latency increases from 2 cycles to 4 cycles. A bit-array design that supports multiple-cast writes and eliminates the associative logic commonly used in base register caching is also developed for the prediction scheme.
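The buffer described in the abstract can be illustrated with a small software model. The following is a minimal sketch under stated assumptions, not the authors' design: the abstract gives only the buffer size, its direct-mapped organization, and the fields it supplies (base address, offset, operand size). The class and method names, the 4-byte instruction size, the tag check, and the rule for when to apply the stride (`stride_ok`, set once the same stride has been seen twice in a row) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

NUM_ENTRIES = 128  # buffer size quoted in the abstract


@dataclass
class Entry:
    tag: int                 # upper PC bits, used to detect aliasing (assumed)
    base_addr: int           # cached copy of the load's base address
    offset: int              # displacement of the load
    size: int                # operand size in bytes
    stride: int = 0          # last observed change in the base address
    stride_ok: bool = False  # hypothetical rule: same stride seen twice


class LoadPredictionBuffer:
    """Direct-mapped buffer indexed by the load's PC (illustrative model)."""

    def __init__(self) -> None:
        self.entries: List[Optional[Entry]] = [None] * NUM_ENTRIES

    @staticmethod
    def _index_and_tag(pc: int) -> Tuple[int, int]:
        word = pc >> 2  # assumes fixed 4-byte instructions
        return word % NUM_ENTRIES, word // NUM_ENTRIES

    def predict(self, pc: int) -> Optional[Tuple[int, int]]:
        """Return (predicted address, operand size), or None on a miss."""
        index, tag = self._index_and_tag(pc)
        e = self.entries[index]
        if e is None or e.tag != tag:
            return None
        addr = e.base_addr + e.offset
        if e.stride_ok:
            addr += e.stride  # apply the stride only once it looks stable
        return addr, e.size

    def update(self, pc: int, base_addr: int, offset: int, size: int) -> None:
        """Record the values the load actually used when it executed."""
        index, tag = self._index_and_tag(pc)
        e = self.entries[index]
        if e is not None and e.tag == tag:
            new_stride = base_addr - e.base_addr
            e.stride_ok = new_stride == e.stride
            e.stride = new_stride
            e.base_addr, e.offset, e.size = base_addr, offset, size
        else:
            # Miss or conflict: a direct-mapped buffer simply overwrites.
            self.entries[index] = Entry(tag, base_addr, offset, size)
```

In this model, `predict` stands in for the lookup at fetch plus the address addition at decode, and `update` stands in for the write-back after the load resolves. A real design would also need the forwarding paths and the bit-array write mechanism mentioned in the abstract, which this sketch does not attempt to model.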