High-level programming language abstractions for advanced and dynamic parallel computations

  • Authors:
  • Steven J. Deitz; Lawrence Snyder

  • Affiliations:
  • University of Washington; University of Washington

  • Year:
  • 2005

Abstract

Call a parallel program p-independent if and only if it always produces the same output on the same input regardless of the number or arrangement of (virtual) processors on which it is run; otherwise, call it p-dependent. Most modern parallel programming facilities let scientists easily and unwittingly write p-dependent codes, even though the vast majority of programs that scientists want to write are p-independent. This disconnect—between the restricted set of codes that programmers want to write and the unrestricted set of codes that modern programming languages let programmers write—is the source of a great deal of the difficulty associated with parallel programming. This thesis presents a combination of p-independent and p-dependent extensions to ZPL. It argues that including a set of p-dependent abstractions in a language with a largely p-independent framework greatly simplifies the task of parallel programming. When a programmer confronts the difficulty of debugging a code that produces correct results on one processor but incorrect results on many processors, the problem is at least isolated to a few key areas of the program. On the other hand, when a programmer must write per-processor code or take advantage of the processor layout, this remains possible. Specifically, this thesis extends ZPL in three directions. First, it introduces abstractions to control processor layout and data distribution. Since these abstractions are first-class and mutable, data redistribution is easy. Second, it introduces abstractions for processor-oriented programming that relax ZPL's programming model and provide a mechanism for writing per-processor codes. Third, it introduces global-view abstractions for user-defined reductions and scans. In addition, this thesis quantitatively and qualitatively evaluates ZPL implementations of three of the NAS benchmark kernels: EP, FT, and IS. ZPL code is shown to be easier to write than MPI code while achieving performance competitive with MPI.
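The notion of p-dependence can be made concrete with a minimal sketch. The following Python (not ZPL; the function `block_reduce` and the input values are invented for illustration) simulates a block-decomposed floating-point sum: each virtual processor sums its block, and the partial sums are then combined. Because floating-point addition is not associative, the grouping—and hence the result—can change with the processor count, making the computation p-dependent.

```python
def block_reduce(values, nprocs):
    """Sum `values` as `nprocs` virtual processors would:
    one partial sum per contiguous block, then a sequential
    combine of the partials."""
    n = len(values)
    partials = []
    for p in range(nprocs):
        lo = p * n // nprocs
        hi = (p + 1) * n // nprocs
        s = 0.0
        for v in values[lo:hi]:
            s += v  # per-processor local accumulation
        partials.append(s)
    total = 0.0
    for s in partials:
        total += s  # combine across processors
    return total

# Catastrophic cancellation makes the grouping visible:
values = [0.1] * 10 + [1e16, -1e16]
print(block_reduce(values, 1))  # one grouping of the additions
print(block_reduce(values, 4))  # a different grouping, different output
```

An integer sum over the same framework would be p-independent, since exact integer addition is associative; it is the rounding in floating-point arithmetic that exposes the processor layout.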
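The third extension—global-view user-defined reductions and scans—can likewise be sketched. In the global-view style, the programmer supplies only an associative combine operator and its identity; the system applies it as a prefix operation and is free to parallelize it because associativity is assumed. This Python sketch (not ZPL syntax; `user_scan` is a hypothetical name) shows the sequential semantics such a scan must preserve:

```python
def user_scan(combine, identity, values):
    """Inclusive prefix scan with a user-supplied associative
    operator: out[i] = combine(out[i-1], values[i])."""
    out = []
    acc = identity
    for v in values:
        acc = combine(acc, v)
        out.append(acc)
    return out

# A reduction beyond the built-ins: running maximum.
print(user_scan(max, float("-inf"), [3, 1, 4, 1, 5]))  # [3, 3, 4, 4, 5]
```

The global-view framing matters: the user writes no per-processor code, yet an implementation may split the input across processors, scan locally, and combine block results, exactly because the operator is declared associative.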