Sensitivity Analysis (SA) is a novel compiler technique that complements, and integrates with, static automatic parallelization analysis for cases when the relevant program behavior is input sensitive. In this paper we show how SA can extract all the input-dependent, statically unavailable conditions under which loops can be dynamically parallelized. SA generates a sequence of sufficient conditions, each of which, when evaluated at run time in order of increasing complexity, can validate the parallel execution of the corresponding loop. For example, SA can first attempt to validate parallelization by checking simple conditions related to loop bounds. If such simple conditions cannot be met, then validating dynamic parallelization may require evaluating conditions over the entire memory reference trace of the loop, which decreases the benefits of parallel execution. We have implemented Sensitivity Analysis in the Polaris compiler and evaluated its performance using 22 industry-standard benchmark codes running on two multicore systems. In most cases we have obtained speedups superior to those of the Intel Ifort compiler, because with SA we could complement static analysis with minimum-cost dynamic analysis and extract most of the available coarse-grained parallelism.
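The cascade of runtime tests described above can be sketched as follows. This is a hypothetical illustration of the idea, not the Polaris implementation: for a loop of the form `for i: A[w(i)] = f(A[r(i)])`, a cheap O(1) bounds test (assuming monotone index expressions) is tried first, and only on failure does the validation fall back to an O(n) test over the full memory reference trace.

```python
def can_parallelize(write_idx, read_idx, n):
    """Decide whether the loop `for i in range(n): A[write_idx(i)] = f(A[read_idx(i)])`
    is free of cross-iteration dependences, using tests ordered by cost.
    Illustrative sketch only: names and structure are assumptions, not Polaris code."""
    # Test 1 -- cheap O(1) bounds test (assumes monotone index functions):
    # if the write range and read range are disjoint, no iteration can
    # read an element another iteration writes.
    w_lo, w_hi = sorted((write_idx(0), write_idx(n - 1)))
    r_lo, r_hi = sorted((read_idx(0), read_idx(n - 1)))
    if w_hi < r_lo or r_hi < w_lo:
        return True

    # Test 2 -- expensive O(n) reference-trace test: record which iteration
    # writes each array element, then look for cross-iteration conflicts.
    owner = {}  # array index -> iteration that writes it
    for i in range(n):
        w = write_idx(i)
        if w in owner:        # output dependence: two iterations write one element
            return False
        owner[w] = i
    for i in range(n):
        j = owner.get(read_idx(i))
        if j is not None and j != i:   # flow/anti dependence across iterations
            return False
    return True
```

For example, `can_parallelize(lambda i: i, lambda i: i + 100, 50)` is validated by the cheap bounds test alone, while `can_parallelize(lambda i: i, lambda i: i, 50)` overlaps in range and is only validated by the trace test.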