High-level programming language abstractions for advanced and dynamic parallel computations

  • Authors:
  • Steven J. Deitz; Lawrence Snyder

  • Affiliations:
  • University of Washington; University of Washington

  • Year:
  • 2005

Abstract

Call a parallel program p-independent if and only if it always produces the same output on the same input regardless of the number or arrangement of (virtual) processors on which it is run; otherwise, call it p-dependent. Most modern parallel programming facilities let scientists easily and unwittingly write p-dependent codes, even though the vast majority of programs that scientists want to write are p-independent. This disconnect—between the restricted set of codes that programmers want to write and the unrestricted set of codes that modern programming languages let programmers write—is the source of a great deal of the difficulty associated with parallel programming. This thesis presents a combination of p-independent and p-dependent extensions to ZPL. It argues that including a set of p-dependent abstractions in a language with a largely p-independent framework greatly simplifies the task of parallel programming. When a programmer confronts the difficulty of debugging a code that produces correct results on one processor but incorrect results on many processors, the problem is at least isolated to a few key areas of the program. On the other hand, when a programmer must write per-processor code or take advantage of the processor layout, this remains possible. Specifically, this thesis extends ZPL in three directions. First, it introduces abstractions to control processor layout and data distribution. Since these abstractions are first-class and mutable, data redistribution is easy. Second, it introduces abstractions for processor-oriented programming that relax ZPL's programming model and provide a mechanism for writing per-processor codes. Third, it introduces global-view abstractions for user-defined reductions and scans. In addition, this thesis quantitatively and qualitatively evaluates ZPL implementations of three of the NAS benchmark kernels: EP, FT, and IS. ZPL code is shown to be easier to write than MPI code while achieving performance competitive with MPI.
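The notion of p-dependence can be made concrete with a minimal sketch. The following Python (not ZPL; the function `block_reduce` and the input values are invented for illustration) simulates a block-decomposed floating-point sum: each virtual processor sums its block, and the partial sums are then combined. Because floating-point addition is not associative, the grouping—and hence the result—can change with the processor count, making the computation p-dependent.

```python
def block_reduce(values, nprocs):
    """Sum `values` as `nprocs` virtual processors would:
    one partial sum per contiguous block, then a sequential
    combine of the partials."""
    n = len(values)
    partials = []
    for p in range(nprocs):
        lo = p * n // nprocs
        hi = (p + 1) * n // nprocs
        s = 0.0
        for v in values[lo:hi]:
            s += v  # per-processor local accumulation
        partials.append(s)
    total = 0.0
    for s in partials:
        total += s  # combine across processors
    return total

# Catastrophic cancellation makes the grouping visible:
values = [0.1] * 10 + [1e16, -1e16]
print(block_reduce(values, 1))  # one grouping of the additions
print(block_reduce(values, 4))  # a different grouping, different output
```

An integer sum over the same framework would be p-independent, since exact integer addition is associative; it is the rounding in floating-point arithmetic that exposes the processor layout.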
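The third extension—global-view user-defined reductions and scans—can likewise be sketched. In the global-view style, the programmer supplies only an associative combine operator and its identity; the system applies it as a prefix operation and is free to parallelize it because associativity is assumed. This Python sketch (not ZPL syntax; `user_scan` is a hypothetical name) shows the sequential semantics such a scan must preserve:

```python
def user_scan(combine, identity, values):
    """Inclusive prefix scan with a user-supplied associative
    operator: out[i] = combine(out[i-1], values[i])."""
    out = []
    acc = identity
    for v in values:
        acc = combine(acc, v)
        out.append(acc)
    return out

# A reduction beyond the built-ins: running maximum.
print(user_scan(max, float("-inf"), [3, 1, 4, 1, 5]))  # [3, 3, 4, 4, 5]
```

The global-view framing matters: the user writes no per-processor code, yet an implementation may split the input across processors, scan locally, and combine block results, exactly because the operator is declared associative.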