A quantitative framework for automated pre-execution thread selection

  • Authors:
  • Amir Roth; Gurindar S. Sohi

  • Affiliations:
  • University of Pennsylvania; University of Wisconsin-Madison

  • Venue:
  • Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
  • Year:
  • 2002

Abstract

Pre-execution attacks cache misses for which address-prediction-driven prefetching fails. In pre-execution, copies of cache miss computations are isolated from the main program and launched as separate threads, called p-threads, whenever the processor anticipates an upcoming miss. P-thread selection is the task of deciding which computations should execute as p-threads and when they should be launched so that total execution time is minimized. It is central to the success of pre-execution.

We introduce a framework for automated static p-thread selection, a static p-thread being one whose dynamic instances are repeatedly launched during the course of program execution. Our approach is to formalize the problem quantitatively and then apply standard techniques to solve it analytically. The framework has two novel components. The slice tree is a data structure that compactly represents a set of static p-threads and the relationships among them. Aggregate advantage is a formula that uses raw program statistics and computation structure to assign each candidate static p-thread a numeric score based on estimated latency tolerance and overhead aggregated over its expected dynamic executions.

We use the framework to select p-threads that cover L2 misses and study its effectiveness under different conditions via detailed simulation. We measure the effect of constraining p-thread length, locally optimizing p-threads, using different program samples as a statistical basis for selection, and varying several machine parameters. Our framework responds to these changes in an intuitive way. We also validate that aggregate advantage correctly models actual pre-execution.
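
As a rough illustration of the two components named in the abstract, the sketch below models a slice tree as a tree of candidate p-threads and scores each one as aggregated benefit minus aggregated overhead. It is a toy under stated assumptions, not the paper's formulation: the class `SliceNode`, its field names, the function `best_candidate`, the simplified score, and all numbers are hypothetical and chosen only to show the trade-off between launching earlier (more latency tolerated) and launching too often (more overhead).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SliceNode:
    """One node of a toy slice tree; each node stands for one candidate
    static p-thread (the path from the miss back to this node).
    All fields are hypothetical illustration parameters."""
    inst: str                   # label of the instruction that triggers the launch
    launches: int               # estimated number of dynamic launches
    misses_covered: int         # launches that actually reach the target miss
    latency_tolerated: float    # cycles of miss latency hidden per covered miss
    overhead_per_launch: float  # cycles of overhead charged per launch
    children: List["SliceNode"] = field(default_factory=list)

    def score(self) -> float:
        # Assumed simplification: aggregated latency tolerance
        # minus aggregated overhead over all dynamic executions.
        benefit = self.misses_covered * self.latency_tolerated
        cost = self.launches * self.overhead_per_launch
        return benefit - cost


def best_candidate(root: SliceNode) -> SliceNode:
    """Walk the whole tree and return the highest-scoring candidate."""
    best = root
    stack = [root]
    while stack:
        node = stack.pop()
        if node.score() > best.score():
            best = node
        stack.extend(node.children)
    return best


# Made-up example: the root is the missing load itself; each child
# extends the slice one step further back in the computation.
far = SliceNode("early_trigger", launches=400, misses_covered=100,
                latency_tolerated=70.0, overhead_per_launch=12.0)
mid = SliceNode("add", launches=120, misses_covered=100,
                latency_tolerated=40.0, overhead_per_launch=2.0,
                children=[far])
root = SliceNode("load", launches=100, misses_covered=100,
                 latency_tolerated=5.0, overhead_per_launch=1.0,
                 children=[mid])

chosen = best_candidate(root)
print(chosen.inst, chosen.score())
```

In this made-up data the middle candidate is chosen: the earliest trigger hides more latency per miss but is launched so often without reaching the miss that its aggregated overhead outweighs the gain.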