Compiler orchestrated prefetching via speculation and predication

  • Authors:
  • Rodric M. Rabbah;Hariharan Sandanagobalane;Mongkol Ekpanyapong;Weng-Fai Wong

  • Affiliations:
  • Massachusetts Institute of Technology;National University of Singapore;Georgia Institute of Technology;National University of Singapore

  • Venue:
  • ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper introduces a compiler orchestrated prefetching system as a unified framework geared toward ameliorating the gap between processing speeds and memory access latencies. We focus the scope of the optimization on specific subsets of the program dependence graph that succinctly characterize the memory access pattern of both regular array-based applications and irregular pointer-intensive programs. We illustrate how program embedded precomputation via speculative execution can accurately predict and effectively prefetch future memory references with negligible overhead. The proposed techniques reduce the total running time of seven SPEC benchmarks and two OLDEN benchmarks by 27% on an Itanium 2 processor. The improvements are in addition to several state-of-the-art optimizations including software pipelining and data prefetching. In addition, we use cycle-accurate simulations to identify important and lightweight architectural innovations that further mitigate the memory system bottleneck. In particular, we focus on the notoriously challenging class of pointer-chasing applications, and demonstrate how they may benefit from a novel scheme of it sentineled prefetching. Our results for twelve SPEC benchmarks demonstrate that 45% of the processor stalls that are caused by the memory system are avoidable. The techniques in this paper can effectively mask long memory latencies with little instruction overhead, and can readily contribute to the performance of processors today.