Run-time parallelization: A framework for parallel computation

  • Authors: Lawrence Rauchwerger

  • Venue: PhD thesis
  • Year: 1995

Abstract

The goal of parallelizing, or restructuring, compilers is to detect and exploit parallelism in sequential programs written in conventional languages. Current parallelizing compilers do a reasonable job of extracting parallelism from programs with regular, statically analyzable access patterns. However, if the memory access pattern of the program depends on the input data, then static data dependence analysis, and consequently compile-time parallelization, is impossible. Moreover, in this case the compiler cannot apply privatization and reduction parallelization, the transformations that have proven most effective at removing data dependences and increasing the amount of exploitable parallelism. Typical examples of irregular, dynamic applications are complex simulations such as SPICE for circuit simulation, DYNA-3D for structural mechanics modeling, DMOL for quantum mechanical simulation of molecules, and CHARMM for molecular dynamics simulation of organic systems. Since irregular programs represent a large and important fraction of applications, an automatable framework for run-time parallelization is needed to complement existing and future static compiler techniques.

In this thesis we present several original techniques that together sketch how automatic compilation can go beyond statically analyzable codes. The methods described are fundamentally efficient, scalable, and general: they are not heuristics with a wide performance distribution across their input domain, but algorithms that can be analytically proven to produce speedups given the necessary resources and available parallelism. We introduce the idea of testing only for full parallelism in the presence of run-time transformations, rather than computing an execution schedule. We introduce the aggressive strategy of speculative parallel execution, scalable to any parallel system, from micros to multiprocessors. We describe a new technique for analyzing and scheduling loops that are only partially parallel. Additionally, we present a framework for parallelizing loops that contain recurrences and have an unknown iteration space.

We believe that, within the domain of automatic parallelization, the true importance of this work lies in breaking the barrier at which automatic parallelization had stopped: regular, well-behaved programs. We also attempt to convey a few general ideas and dispel some older ones. Namely, optimizing at run time implies overhead, but it can actually reduce overall execution time through better exploitation of resources; and speculating about optimizations, more specifically parallelism, may be an attractive and more generally applicable alternative to the "inspect first and execute later" strategy. Thus we view this thesis as the first step toward the full integration of these techniques into commercial parallelizing compilers.
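
To make the speculative strategy concrete, here is a minimal C/OpenMP sketch of the idea the abstract describes: execute the loop in parallel as if it were independent while shadow arrays record every read and write, then run a post-execution test to decide whether the speculation was valid, rolling back to a checkpoint and re-executing sequentially if it was not. The marking scheme below is a deliberate simplification (the full test in the thesis also validates privatization and reduction accesses, omitted here), and all identifiers are illustrative rather than taken from the thesis.

    /* Compile with: cc -fopenmp spec_test.c (pragmas are ignored without it) */
    #include <stdio.h>
    #include <string.h>

    #define N     16   /* elements in the shared array     */
    #define ITERS  8   /* iterations of the candidate loop */

    int main(void) {
        double A[N], backup[N];
        int W[ITERS], R[ITERS];    /* subscript arrays, input dependent */
        int wcnt[N] = {0};         /* shadow: writes seen per element   */
        int rd[N]   = {0};         /* shadow: reads seen per element    */
        int fail = 0;

        for (int e = 0; e < N; e++) A[e] = (double)e;
        /* Access pattern known only at run time; here writes touch 0..7
           and reads touch 8..15, so the loop happens to be fully parallel. */
        for (int i = 0; i < ITERS; i++) { W[i] = i; R[i] = i + ITERS; }

        memcpy(backup, A, sizeof A);   /* checkpoint for possible rollback */

        /* Speculative parallel execution: run the loop in parallel while
           marking every access in the shadow arrays. */
        #pragma omp parallel for
        for (int i = 0; i < ITERS; i++) {
            #pragma omp atomic
            rd[R[i]]++;
            #pragma omp atomic
            wcnt[W[i]]++;
            A[W[i]] = 2.0 * A[R[i]];
        }

        /* Post-execution test: declare the loop fully parallel only if no
           element was written twice (output dependence) or both read and
           written (flow/anti dependence). */
        for (int e = 0; e < N; e++)
            if (wcnt[e] > 1 || (wcnt[e] > 0 && rd[e] > 0))
                fail = 1;

        if (fail) {                       /* speculation failed: restore */
            memcpy(A, backup, sizeof A);  /* and re-execute sequentially */
            for (int i = 0; i < ITERS; i++)
                A[W[i]] = 2.0 * A[R[i]];
        }
        printf("speculation %s\n",
               fail ? "failed: loop re-executed serially"
                    : "succeeded: parallel results kept");
        return 0;
    }

With the pattern above the test passes and the checkpoint is discarded; changing the subscript arrays so that some element is written twice, or read and written across iterations, makes the test fail and triggers the sequential re-execution, which is the worst case the speculative strategy must absorb.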
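The abstract also names reduction parallelization as a transformation static analysis cannot apply when subscripts depend on the input. A common run-time form of that transformation, sketched below under the assumption that a run-time test has already confirmed the updates really form a reduction, gives each processor a private accumulator and merges the partial results after the loop; again, the names are hypothetical, not code from the thesis.

    /* Compile with: cc -fopenmp irregular_reduction.c */
    #include <stdio.h>

    #define N     12    /* reduction elements */
    #define ITERS 96    /* loop iterations    */

    int main(void) {
        double sum[N] = {0.0};
        int idx[ITERS];

        /* Input-dependent subscripts: which element an iteration updates
           cannot be known at compile time. */
        for (int i = 0; i < ITERS; i++) idx[i] = (i * 7) % N;

        #pragma omp parallel
        {
            double priv[N] = {0.0};    /* privatized accumulator */

            /* Each thread folds its iterations into its private copy, so
               concurrent updates to the same element never conflict. */
            #pragma omp for nowait
            for (int i = 0; i < ITERS; i++)
                priv[idx[i]] += 1.0;

            /* Merge the private partial sums into the shared result. */
            for (int e = 0; e < N; e++) {
                #pragma omp atomic
                sum[e] += priv[e];
            }
        }

        for (int e = 0; e < N; e++)
            printf("sum[%d] = %g\n", e, sum[e]);
        return 0;
    }

Privatization removes the cross-iteration dependences on sum; the merge step is the only synchronized code, so the transformation scales with the number of processors.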