Speculative Data-Driven Multithreading

  • Authors:
  • Amir Roth;Gurindar S. Sohi

  • Affiliations:
  • -;-

  • Venue:
  • HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
  • Year:
  • 2001

Quantified Score

Hi-index 0.03

Visualization

Abstract

Abstract: Mispredicted branches and loads that miss in the cache cause the majority of retirement stalls experienced by sequential processors; we call these critical instructions.Despite their importance, a sequential processor has difficulty prioritizing critical computations (computations of critical instructions), because it must fetch all computations sequentially,regardless of their contribution to performance. Speculative data-driven multithreading (DDMT) is a general-purpose mechanism for overcoming this limitation.In DDMT,critical computations are annotated so that they can execute standalone. When the processor predicts an upcoming instance of a critical instruction, it microarchitecturally forks a copy of its computation as a new kind of speculative thread: a data-driven thread (DDT). The DDT executes in parallel with the main program thread, but typically generates the critical result much faster since it fetches and executes only the critical computation and not the whole program. A DDT "pre-executes" critical computation and effectively "consumes" its latency on behalf of the main thread. A DDMT component called integration incorporates results computed in DDTs directly into the main thread, sparing it from having to repeat the work.We simulate an implementation of DDMT on top of a simultaneous multithreading (SMT)processor and use program profiles to create DDTs and annotate them into the executable. Our experiments show that DDMT pre-execution of critical loads and branches can improve performance significantly.