Implementation of Sensitivity Analysis for Automatic Parallelization

Authors:
Silvius Rus;Maikel Pennings;Lawrence Rauchwerger
Affiliations:
Parasol Lab, Department of Computer Science, Texas A&M University;Parasol Lab, Department of Computer Science, Texas A&M University;Parasol Lab, Department of Computer Science, Texas A&M University
Venue:
Languages and Compilers for Parallel Computing
Year:
2008

Abstract

Sensitivity Analysis (SA) is a novel compiler technique that complements, and integrates with, static automatic parallelization analysis in cases where program behavior is input-sensitive. SA can extract all the input-dependent, statically unavailable conditions under which loops can be dynamically parallelized. SA generates a sequence of sufficient conditions which, when evaluated dynamically in order of their complexity, can each validate the dynamic parallel execution of the corresponding loop. While SA's principles are fairly simple, implementing it in a real compiler and obtaining good experimental results on benchmark codes is a difficult task. In this paper we present some of the most important implementation issues that we had to overcome in order to achieve a fairly successful automatic parallelizer. We present techniques for validating dependence-removing transformations, e.g., privatization or pushback parallelization, and for the static and dynamic evaluation of complex conditions for loop parallelization. We also address multi-version and parallel code generation, as well as the use of speculative parallelization when other, less costly options fail. We present a summary table of the contributions of our techniques to the successful parallelization of 22 industry benchmark codes. We also report speedups and parallel coverage of these codes on two multicore-based systems and compare them to results obtained by the Ifort compiler.
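
The idea of evaluating sufficient conditions in order of their complexity and selecting a code version at run time can be illustrated with a small sketch. This is not the authors' implementation. The loop body, the three conditions, and every name (`update`, `CONDITIONS`, `run_loop`) are hypothetical, chosen only to show the control flow. The sequential fallback stands in for the speculative parallelization the abstract mentions, which is not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor


def update(a, b, idx, i):
    # Hypothetical loop body: iteration i reads and writes only a[idx[i]].
    a[idx[i]] = 2 * a[idx[i]] + b[i]


def run_sequential(a, b, idx):
    for i in range(len(idx)):
        update(a, b, idx, i)


def run_parallel(a, b, idx, workers=4):
    # Safe only when no two iterations touch the same element of a.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda i: update(a, b, idx, i), range(len(idx))))


# Sufficient (not necessary) independence conditions, cheapest first.
# Each one alone guarantees that the values in idx are pairwise distinct.
CONDITIONS = [
    ("trivial", lambda idx: len(idx) <= 1),
    ("strictly increasing",
     lambda idx: all(x < y for x, y in zip(idx, idx[1:]))),
    ("all distinct", lambda idx: len(set(idx)) == len(idx)),
]


def run_loop(a, b, idx):
    # Multi-version dispatch: the first condition that holds validates
    # the parallel version; otherwise fall back to the sequential loop.
    for name, holds in CONDITIONS:
        if holds(idx):
            run_parallel(a, b, idx)
            return name
    run_sequential(a, b, idx)
    return "sequential fallback"
```

For instance, `run_loop(a, b, [0, 1, 2, 3])` would be validated by the strictly-increasing check without ever building the set used by the most expensive test, while an index array with repeated entries fails all three conditions and runs sequentially.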