Increasing hardware data prefetching performance using the second-level cache

  • Authors:
  • Nathalie Drach; Jean-Luc Béchennec; Olivier Temam

  • Affiliations:
  • LRI, Paris South University, 91405 Orsay Cedex, France (all authors)

  • Venue:
  • Journal of Systems Architecture: the EUROMICRO Journal
  • Year:
  • 2002

Abstract

Techniques to reduce or tolerate large memory latencies are critical for achieving high processor performance. Hardware data prefetching is one of the most heavily studied solutions, but it has mostly been applied to first-level caches, where it can severely disrupt processor behavior by delaying normal cache requests, inducing cache pollution, and occupying the heavily used bus to the second-level cache. In this article, we show that applying hardware data prefetching to the second-level cache retains most of the benefits of first-level cache prefetching with almost none of its drawbacks. Moreover, we show that second-level hardware data prefetching is particularly well suited to out-of-order (OoO) processors: it can hide the long memory latencies of second-level cache misses, while OoO execution of memory instructions can hide the shorter latencies of first-level cache misses that hit in the second-level cache. Finally, we show that when the full memory system is taken into account, especially bus traffic, first-level cache prefetching can actually degrade overall processor performance, while second-level cache prefetching consistently improves it. Our experimental results show that the instructions per cycle (IPC) of floating-point SPEC95 programs increases by 20% on average with second-level cache hardware data prefetching, while it decreases by 5% on average with first-level cache hardware data prefetching.
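
To make the mechanism concrete, the sketch below is a minimal, purely illustrative model of where a hardware prefetcher installs blocks in a two-level hierarchy. It assumes a direct-mapped L1 and L2 and a simple next-block (sequential) prefetch triggered on an L1 miss; the cache sizes, block size, and prefetch policy are assumptions for illustration, not the paper's actual design, and the model only counts misses. It does not capture the pollution, bus-occupancy, or latency-hiding effects that drive the paper's results.

```python
# Toy two-level cache model (illustrative only, not the paper's simulator).
# Contrasts next-block hardware prefetching into L1 vs. into L2.
# Block size, cache sizes, and the sequential policy are assumptions.

BLOCK = 64            # bytes per cache block (assumed)
L1_BLOCKS = 512       # 32 KB direct-mapped L1 (assumed)
L2_BLOCKS = 4096      # 256 KB direct-mapped L2 (assumed)

def simulate(addresses, prefetch_level=None):
    l1, l2 = {}, {}
    stats = {"l1_miss": 0, "l2_miss": 0, "prefetches": 0}

    def install(cache, nblocks, block):
        cache[block % nblocks] = block          # direct-mapped placement

    def present(cache, nblocks, block):
        return cache.get(block % nblocks) == block

    for addr in addresses:
        block = addr // BLOCK
        if not present(l1, L1_BLOCKS, block):
            stats["l1_miss"] += 1
            if not present(l2, L2_BLOCKS, block):
                stats["l2_miss"] += 1
                install(l2, L2_BLOCKS, block)   # fill L2 from memory
            install(l1, L1_BLOCKS, block)       # fill L1 from L2
            # Hardware next-block prefetch issued on the miss:
            nxt = block + 1
            stats["prefetches"] += 1
            if prefetch_level == "L1":
                # L1 prefetching: the prefetched block is forced into the
                # small L1, where it can evict useful data (pollution) and
                # its transfer occupies the L1-L2 bus.
                install(l2, L2_BLOCKS, nxt)
                install(l1, L1_BLOCKS, nxt)
            elif prefetch_level == "L2":
                # L2 prefetching: the L1 is left undisturbed; a later L1
                # miss on `nxt` hits in L2 at a much lower latency than
                # memory, a latency OoO execution can often hide.
                install(l2, L2_BLOCKS, nxt)
    return stats

# Streaming access pattern where sequential prefetching is effective.
stream = [i * BLOCK for i in range(10000)]
for level in (None, "L1", "L2"):
    print(level, simulate(stream, level))
```

On this streaming pattern the L2-prefetching run turns almost all L2 misses into L2 hits while leaving L1 behavior unchanged; judging the L1-prefetching variant, however, requires a timing model of bus contention and cache pollution, which is exactly the full-memory-system evaluation the paper performs.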