This paper presents a fully automatic approach to loop parallelization that integrates static and run-time analysis, thus overcoming many known difficulties such as nonlinear and indirect array indexing and complex control flow. Our hybrid analysis framework validates the parallelization transformation by verifying the independence of the loop's memory references. To this end it represents array references in the USR (uniform set representation) language and expresses the independence condition as an equation, S = ∅, where S is a set expression representing array indexes. Using a language instead of an array-abstraction representation for S results in fewer conservative approximations but can incur a high runtime cost. To alleviate this cost we introduce a language translation F from the USR set-expression language to an equally rich language of predicates, such that F(S) holds if and only if S = ∅. Loop parallelization is then validated by a novel logic inference algorithm that factorizes the resulting complex predicate F(S) into a sequence of sufficient independence conditions, which are evaluated first statically and, when needed, dynamically, in increasing order of their estimated complexities. We evaluate our automated solution on 26 benchmarks from the PERFECT-Club and SPEC suites and show that our approach is effective in parallelizing large, complex loops and obtains much better full-program speedups than the Intel and IBM Fortran compilers.
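The staged evaluation described in the abstract can be sketched in a few lines. This is only an illustrative outline under assumed names (`SufficientCondition`, `is_independent`, and the toy conditions are invented for exposition, not the paper's actual implementation): the independence predicate is factorized into sufficient conditions, which are tried in increasing order of estimated cost, so that a cheap static-style check can prove independence before any expensive dynamic check runs.

```python
# Hypothetical sketch of the staged independence check: sufficient
# conditions are tried in increasing order of estimated cost, and the
# loop is judged parallelizable as soon as one of them holds.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SufficientCondition:
    # Invented helper type for illustration; not the paper's API.
    description: str
    cost: int                       # estimated evaluation cost
    check: Callable[[Dict], bool]   # predicate over the loop context


def is_independent(conditions: List[SufficientCondition], ctx: Dict) -> bool:
    """Return True if any sufficient condition proves independence."""
    for cond in sorted(conditions, key=lambda c: c.cost):
        if cond.check(ctx):
            return True
    return False


# Toy example: a loop writing a[x[i]] for i in range(n), where x is an
# indirection array whose contents are only known at run time.
conds = [
    # Cheap check: a loop with at most one iteration is trivially independent.
    SufficientCondition("n <= 1", cost=1, check=lambda c: c["n"] <= 1),
    # Costlier dynamic check: the indirection array holds no duplicates,
    # so no two iterations write the same element.
    SufficientCondition("x is duplicate-free", cost=10,
                        check=lambda c: len(set(c["x"])) == len(c["x"])),
]

print(is_independent(conds, {"n": 4, "x": [3, 0, 2, 1]}))  # True
```

Here the cheap condition fails (n = 4), so the duplicate-free check runs and succeeds; when all conditions fail, a real system would fall back to sequential execution rather than conclude dependence.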