Future Execution: A Hardware Prefetching Technique for Chip Multiprocessors

  • Authors:
  • Ilya Ganusov; Martin Burtscher

  • Affiliations:
  • Computer Systems Laboratory, Cornell University; Computer Systems Laboratory, Cornell University

  • Venue:
  • Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
  • Year:
  • 2005

Abstract

This paper proposes a new hardware technique for using one core of a CMP to prefetch data for a thread running on another core. Our approach simply executes a copy of all non-control instructions in the prefetching core after they have executed in the primary core. On the way to the second core, each instruction's output is replaced by a prediction of the likely output that the nth future instance of this instruction will produce. Speculatively executing the resulting instruction stream on the second core issues load requests that the main program will probably reference in the future. Unlike previously proposed thread-based prefetching approaches, our technique does not need any thread spawning points, features an adjustable lookahead distance, does not require complicated analyzers to extract prefetching threads, is recovery-free, and necessitates no storage for the prefetching threads. We demonstrate that for the SPECcpu2000 benchmark suite, our mechanism significantly increases the prefetching coverage and improves the primary core's performance by 10% on average over a baseline that already includes an aggressive hardware stream prefetcher. We further show that our approach works well in combination with runahead execution.
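
The core idea of the abstract, replacing an instruction's output with a prediction of its nth future instance, can be illustrated with a small software model. The sketch below is not the authors' hardware design: it assumes a simple per-instruction stride predictor (the abstract does not say which predictor is used), and the names FutureValuePredictor, update_and_predict, and simulate are hypothetical, as are the program counter value and the address pattern.

```python
from dataclasses import dataclass


@dataclass
class StrideEntry:
    last: int = 0    # most recent committed output of this instruction
    stride: int = 0  # difference between its last two outputs


class FutureValuePredictor:
    """Hypothetical stride predictor, indexed by instruction address (pc)."""

    def __init__(self, lookahead):
        self.lookahead = lookahead  # the "n" in "nth future instance"
        self.table = {}

    def update_and_predict(self, pc, value):
        # Train on the value the primary core actually produced, then
        # extrapolate `lookahead` instances ahead. The first time a pc is
        # seen, the stride is 0, so the prediction equals the value itself.
        entry = self.table.setdefault(pc, StrideEntry(last=value))
        entry.stride = value - entry.last
        entry.last = value
        return value + self.lookahead * entry.stride


def simulate(base, step, iterations, lookahead):
    """Count primary-core loads whose address was already prefetched."""
    predictor = FutureValuePredictor(lookahead)
    prefetched = set()
    covered = 0
    for i in range(iterations):
        # Primary core: an address computation followed by a load.
        addr = base + i * step
        if addr in prefetched:
            covered += 1
        # Prefetching core: the address computation's output is replaced
        # by its predicted future value, so the copied load touches the
        # address the primary core is expected to need later.
        future_addr = predictor.update_and_predict(pc=0x10, value=addr)
        prefetched.add(future_addr)
    return covered


if __name__ == "__main__":
    print(simulate(base=0x1000, step=64, iterations=10, lookahead=4))
```

In this toy model the first lookahead + 1 loads miss while the predictor learns the stride and the prefetches get ahead of the primary core; every later load is covered. Raising the lookahead value lengthens that warm-up but issues each prefetch earlier, which is the trade-off an adjustable lookahead distance exposes.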