Cost-Effective Compiler Directed Memory Prefetching and Bypassing

  • Authors:
  • Daniel Ortega;Eduard Ayguadé;Jean-Loup Baer;Mateo Valero

  • Venue:
  • Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
  • Year:
  • 2002

Abstract

Ever-increasing memory latencies and deeper pipelines push memory farther from the processor. Prefetching techniques aim to bridge these two gaps by fetching data in advance into both the L1 cache and the register file. Our main contribution in this paper is a hybrid approach to the prefetching problem that combines software and hardware prefetching in a cost-effective way, requiring very little hardware support and minimally impacting the design of the processor pipeline. The prefetcher is built on top of static memory instruction bypassing, which is in charge of bringing prefetched values into the register file. In this paper we also present a thorough analysis of the limits of both prefetching and memory instruction bypassing. We also compare our prefetching technique with a prior speculative proposal that attacked the same problem, and we show that, at much lower cost, our hybrid solution is better than a realistic implementation of speculative prefetching and bypassing. On average, our hybrid implementation achieves a 13% speed-up over a version with software prefetching in a subset of numerical applications and an average of 43% over a version with no software prefetching (achieving up to 102% for specific benchmarks).