Profiling for Input Predictable Threads

  • Venue: ICCD '98 Proceedings of the International Conference on Computer Design
  • Year: 1998

Abstract

Thread-level speculative execution, combined with value prediction to break data dependencies between threads, may enable efficient execution of a single program on a closely coupled chip multiprocessor. This paper describes a method for partitioning sequential programs so that a minimum number of hard-to-predict data dependencies cross thread boundaries. SPEC95 binaries are partitioned and then executed on an idealized simulator to study the limits of this technique. Value prediction is shown to be crucial to exposing thread-level parallelism (TLP). Simple value predictors promise modest to large gains if thread boundaries are properly identified. Concurrent execution of fully data-independent threads on 8 processors gives a geometric mean speedup of 2.4 for the benchmarks studied.
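
To make "simple value predictor" concrete, the following is a minimal sketch of a stride predictor, one common kind of simple predictor. It is an illustration only: the abstract does not say which predictors the paper evaluates, and the names `StridePredictor`, `prediction_accuracy`, and `geometric_mean` are hypothetical. A value that crosses a thread boundary counts as predictable here if the predictor guesses it before it is produced.

```python
from math import prod


class StridePredictor:
    """Hypothetical stride predictor: guesses last value + last observed stride.

    Illustrative only; not taken from the paper.
    """

    def __init__(self):
        self.last = None   # most recent actual value seen
        self.stride = 0    # difference between the two most recent values

    def predict(self):
        # No prediction is possible before the first value has been observed.
        if self.last is None:
            return None
        return self.last + self.stride

    def update(self, actual):
        # Learn the new stride, then remember the value for the next guess.
        if self.last is not None:
            self.stride = actual - self.last
        self.last = actual


def prediction_accuracy(values):
    """Fraction of values in the sequence that the predictor guesses correctly."""
    if not values:
        return 0.0
    predictor = StridePredictor()
    hits = 0
    for value in values:
        if predictor.predict() == value:
            hits += 1
        predictor.update(value)
    return hits / len(values)


def geometric_mean(speedups):
    """Geometric mean, the averaging used for the speedup quoted in the abstract."""
    return prod(speedups) ** (1.0 / len(speedups))
```

On a made-up sequence such as a loop induction variable `[0, 4, 8, 12, 16]`, the predictor misses the first two values while it learns the stride and then hits the rest. A dependency like this would be a reasonable candidate to cross a thread boundary, whereas a value the predictor rarely guesses would ideally stay inside a single thread.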