Accelerating and Adapting Precomputation Threads for Efficient Prefetching

  • Authors:
  • Weifeng Zhang;Dean M. Tullsen;Brad Calder

  • Affiliations:
  • Department of Computer Science and Engineering, University of California, San Diego;Department of Computer Science and Engineering, University of California, San Diego;Department of Computer Science and Engineering, University of California, San Diego

  • Venue:
  • HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
  • Year:
  • 2007

Abstract

Speculative precomputation enables effective cache prefetching even for irregular memory access behavior by using an alternate thread on a multithreaded or multi-core architecture. This paper describes a system that constructs and runs precomputation-based prefetching threads via event-driven dynamic optimization. Precomputation threads (p-threads) are dynamically constructed by a runtime compiler from the program's frequently executed hot traces, and are adapted to the memory behavior automatically. Both construction and execution of the prefetching threads happen in another thread, imposing little overhead on the main thread. This paper also presents several techniques to accelerate the precomputation threads, including colocation of p-threads with hot traces, dynamic stride prediction, and automatic adaptation of runahead and jumpstart distance. The adaptive prefetching achieves 42% speedup, a 17% improvement over existing p-thread prefetching schemes.
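
To illustrate the general idea behind stride prediction named in the abstract, here is a minimal sketch of a textbook per-load stride predictor with a confidence counter. It is not the paper's mechanism: the class names, the `threshold` and `max_conf` parameters, and the `distance` argument (a stand-in for how far ahead to prefetch) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StrideEntry:
    """State tracked for one load instruction (hypothetical layout)."""
    last_addr: int
    stride: int = 0
    confidence: int = 0  # saturating counter


class StridePredictor:
    """Generic stride predictor; an illustrative assumption, not the paper's design."""

    def __init__(self, threshold: int = 2, max_conf: int = 3) -> None:
        self.table: Dict[int, StrideEntry] = {}
        self.threshold = threshold  # confidence needed before predicting
        self.max_conf = max_conf    # saturation limit of the counter

    def observe(self, pc: int, addr: int, distance: int = 1) -> Optional[int]:
        """Record a load at `pc` touching `addr`.

        Returns an address `distance` strides ahead once the stride has
        repeated often enough, otherwise None.
        """
        entry = self.table.get(pc)
        if entry is None:
            # First time this load is seen: nothing to compare against yet.
            self.table[pc] = StrideEntry(last_addr=addr)
            return None

        new_stride = addr - entry.last_addr
        if new_stride == entry.stride and new_stride != 0:
            # Same non-zero stride again: grow confidence, saturating.
            entry.confidence = min(entry.confidence + 1, self.max_conf)
        else:
            # Stride changed: remember the new one and start over.
            entry.stride = new_stride
            entry.confidence = 0
        entry.last_addr = addr

        if entry.confidence >= self.threshold:
            return addr + distance * entry.stride
        return None


predictor = StridePredictor()
# A load at a hypothetical PC walking memory in 64-byte steps.
predictions = [predictor.observe(0x40, a, distance=4)
               for a in (1000, 1064, 1128, 1192)]
```

In this sketch a prediction is issued only after the same stride has been seen on consecutive accesses, and an irregular access resets the confidence. A larger `distance` prefetches further ahead, which is loosely the quantity an adaptive scheme would tune at run time.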