Improving the performance of runtime parallelization

  • Authors:
  • Shun-Tak Leung; John Zahorjan

  • Affiliations:
  • -;-

  • Venue:
  • PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
  • Year:
  • 1993

Abstract

When the dependency pattern among the iterations of a loop cannot be determined statically, compile-time parallelization of the loop is not possible. In these cases, runtime parallelization [8] is the only alternative. The idea is to transform the loop into two code fragments: the inspector and the executor. When the program is run, the inspector examines the iteration dependencies and constructs a parallel schedule. The executor subsequently uses that schedule to carry out the actual computation in parallel.

In this paper, we show how to reduce the overhead of running the inspector by executing it in parallel. We describe two related approaches. The first, which emphasizes inspector efficiency, achieves nearly linear speedup relative to a sequential execution of the inspector, but produces a schedule that may be less efficient for the executor. The second technique, which emphasizes executor efficiency, does not in general achieve linear speedup of the inspector, but is guaranteed to produce the best achievable schedule. We present these techniques, show that they are correct, and compare their performance to existing techniques using a set of experiments.

Because in this paper we optimize inspector time but leave the executor unchanged, the techniques we present have the most dramatic effect when the inspector must be run for each invocation of the source loop. In a companion paper [3], we explore techniques that build upon those developed here to also improve executor performance.
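
To make the inspector/executor split concrete, the following is a minimal sketch of the general idea only. It shows a plain sequential inspector, not either of the parallel inspectors the paper proposes. The loop body `x[w[i]] = x[r[i]] + 1`, the function names, and the wavefront-style schedule are all assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def inspector(read_idx, write_idx, size):
    """Sequentially scan the iterations and group them into wavefronts.

    Iterations placed in the same wavefront have no flow, anti, or output
    dependence on one another, so they may run concurrently.
    """
    last_write = [-1] * size  # highest wavefront that wrote each element
    last_read = [-1] * size   # highest wavefront that read each element
    levels = []
    for r, w in zip(read_idx, write_idx):
        # Must follow the last writer of r (flow), the last writer of w
        # (output), and every earlier reader of w (anti).
        level = 1 + max(last_write[r], last_write[w], last_read[w])
        levels.append(level)
        last_read[r] = max(last_read[r], level)
        last_write[w] = level
    schedule = [[] for _ in range(max(levels, default=-1) + 1)]
    for i, level in enumerate(levels):
        schedule[level].append(i)
    return schedule


def executor(x, read_idx, write_idx, schedule, workers=4):
    """Run the loop body one wavefront at a time, in parallel within each."""
    def body(i):
        x[write_idx[i]] = x[read_idx[i]] + 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for wavefront in schedule:
            # Consuming the map waits for the whole wavefront (a barrier).
            list(pool.map(body, wavefront))
    return x


def sequential(x, read_idx, write_idx):
    """Reference: the original loop, executed in source order."""
    for r, w in zip(read_idx, write_idx):
        x[w] = x[r] + 1
    return x
```

With `read_idx = [0, 1, 2, 0]` and `write_idx = [1, 2, 3, 4]`, this inspector places iterations 0 and 3 in the first wavefront and iterations 1 and 2 in successive ones. The inspector here is itself a sequential loop over all iterations, which is the overhead the abstract sets out to reduce.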