A general compiler framework for speculative multithreaded processors

  • Authors:
  • Anasua Bhowmik; Manoj Franklin


  • Year:
  • 2003

Abstract

The increasing transistor densities offered by semiconductor technology and the desire to reduce the completion time of general-purpose programs have led hardware designers and compiler writers to develop aggressive techniques for exploiting program parallelism. Speculative multithreading (SpMT) is emerging as an effective mechanism for exploiting thread-level parallelism from non-numeric programs. SpMT processors allow the program to execute in the presence of ambiguous control and data dependences and to recover when dependence violations are detected at run time. Proper thread formation is crucial for obtaining good speedup in an SpMT system. Most existing compilers for SpMT architectures concentrate on extracting parallelism only from loops. However, general-purpose programs spend a considerable amount of time outside loops (or in loops that cannot be parallelized), and studies have confirmed that a significant amount of parallelism also exists outside loops. In this dissertation we present a compiler framework for partitioning sequential programs into multiple threads for parallel execution on an SpMT system. Besides exploiting loop-level parallelism, our compiler extracts parallelism from regions outside loops as well. The compiler performs extensive program analysis and profiling to capture the actual data dependences and control dependences in programs. We have also developed a simple and fast interprocedural analysis scheme to facilitate program analysis by SpMT compilers. We have proposed two models for representing inter-thread data dependences and have used them in our partitioning algorithm. The compiler is implemented on the SUIF-MachSUIF platform and is able to partition large programs, such as the SPEC benchmark programs. We have evaluated the effectiveness of our compiler with a trace-driven simulator. The experimental results clearly show the ability of our compiler to extract significant parallelism from both the loop and non-loop regions of non-numeric programs. Our experimental studies also indicate that hardware support for data value prediction and out-of-order thread spawning is needed to achieve better performance.
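
To make the SpMT execution model concrete, the following minimal C sketch simulates, in ordinary sequential code, what an SpMT processor does in hardware: a speculative thread starts from a predicted value of a datum produced by an earlier thread, and is squashed and re-executed if the prediction turns out to violate the true data dependence. The region split, the value prediction, and the function names (thread1_work, thread2_work) are illustrative assumptions, not the actual partitioning produced by the dissertation's compiler.

    #include <stdio.h>

    /* A region split into two candidate threads by a hypothetical SpMT
     * partitioner.  Thread 2 consumes 'sum', which thread 1 produces, so
     * thread 2 starts from a predicted value and must be squashed if the
     * prediction is wrong.  Everything runs sequentially here; it only
     * simulates the detect-and-recover behaviour that SpMT hardware
     * provides at run time. */

    static int thread1_work(const int *a, int n) {
        int sum = 0;
        for (int i = 0; i < n; i++)    /* loop-level work stays in thread 1 */
            sum += a[i];
        return sum;
    }

    static int thread2_work(int sum_in) {
        return sum_in * 2 + 1;         /* non-loop work that depends on 'sum' */
    }

    int main(void) {
        int a[] = { 3, 1, 4, 1, 5 };
        int n = (int)(sizeof a / sizeof a[0]);

        int predicted_sum = 0;                          /* value prediction for 'sum'   */
        int spec_result   = thread2_work(predicted_sum); /* speculative execution        */

        int real_sum = thread1_work(a, n);              /* non-speculative thread commits */

        if (real_sum != predicted_sum) {                /* dependence violation detected  */
            spec_result = thread2_work(real_sum);       /* squash and re-execute          */
        }

        printf("result = %d\n", spec_result);
        return 0;
    }

In an actual SpMT system the spawn, the dependence check, and the squash are handled by the hardware; the compiler's role, as described in the abstract, is to choose thread boundaries so that such violations and re-executions are rare.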