Dynamic memory instruction bypassing

  • Authors:
  • Daniel Ortega; Mateo Valero; Eduard Ayguadé

  • Affiliations:
  • Barcelona Research Office, Hewlett Packard Laboratories, Barcelona, Spain; Depto. de Arquitectura de Computadores, Universidad Politécnica de Cataluña, Barcelona, Spain; Depto. de Arquitectura de Computadores, Universidad Politécnica de Cataluña, Barcelona, Spain

  • Venue:
  • International Journal of Parallel Programming - Special issue I: The 17th annual international conference on supercomputing (ICS'03)
  • Year:
  • 2004

Abstract

Reducing the latency of load instructions is among the most crucial aspects of achieving high performance in current and future microarchitectures. Deep pipelining impacts load-to-use latency even for loads that hit in cache. In this paper we present a dynamic mechanism that detects relations between address-producing instructions and the loads that consume these addresses, and uses this information to access data before the load is even fetched from the I-Cache. This mechanism is not intended to prefetch from outside the chip but to move data from L1 and L2 silently and ahead of time into the register file, allowing the bypassing of the load instruction (hence the name). An average performance improvement of 22.24% is achieved in the SPECint95 benchmarks.
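
The idea in the abstract, linking an address-producing instruction to the load that consumes its result and then reading the data as soon as the address is known, can be illustrated with a toy software model. The sketch below is only an illustration under stated assumptions: the names `BypassTable`, `Link`, `observe_producer`, `observe_load`, and `early_access` are hypothetical, memory is modeled as a plain dictionary, and the table layout is not taken from the paper, which may organize and train its structures quite differently.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical record tying a producer to the load that uses its result."""
    load_pc: int    # PC of the consuming load
    offset: int     # displacement the load adds to the produced value
    dest_reg: str   # register the load writes


class BypassTable:
    """Toy model (an assumption, not the paper's design) of producer-to-load links."""

    def __init__(self):
        self.links = {}        # producer PC -> Link
        self.last_writer = {}  # register name -> PC of the last instruction that wrote it

    def observe_producer(self, pc, dest_reg):
        # Remember which instruction last wrote this register.
        self.last_writer[dest_reg] = pc

    def observe_load(self, pc, base_reg, offset, dest_reg):
        # Training step: link the load to whoever produced its base register.
        producer = self.last_writer.get(base_reg)
        if producer is not None:
            self.links[producer] = Link(pc, offset, dest_reg)
        # The load itself also writes a register.
        self.last_writer[dest_reg] = pc

    def early_access(self, producer_pc, produced_value, memory):
        # When a known producer executes again, compute the load's address
        # and read the data before the load itself has been fetched.
        link = self.links.get(producer_pc)
        if link is None:
            return None
        address = produced_value + link.offset
        return link.load_pc, link.dest_reg, memory.get(address)
```

In this model, a first encounter of the producer and load pair trains the table through `observe_producer` and `observe_load`. On a later execution of the producer, `early_access` returns the load's PC, its destination register, and the value at the computed address, which a real design would place in the register file so that the load can be bypassed.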