Combining compile-time and run-time parallelization[1]

Authors:
Sungdo Moon;Byoungro So;Mary W. Hall
Affiliations:
(Correspd.) Information Sciences Inst., University of Southern California, 4676 Admiralty Way, Marina del Rey, CA 90292, USA. Tel.: +1 310 822 1510, ext. 458; Fax: +1 310 822 7791 ...
Information Sciences Inst., University of Southern California, 4676 Admiralty Way, Marina del Rey, CA 90292, USA. Tel.: +1 310 822 1510, ext. 458; Fax: +1 310 822 7791; E-mail ...
Information Sciences Inst., University of Southern California, 4676 Admiralty Way, Marina del Rey, CA 90292, USA. Tel.: +1 310 822 1510, ext. 458; Fax: +1 310 822 7791; E-mail ...
Venue:
Scientific Programming
Year:
1999

Abstract

This paper demonstrates that significant improvements to automatic parallelization technology require that existing systems be extended in two ways: (1) they must combine high-quality compile-time analysis with low-cost run-time testing; and (2) they must take control flow into account during analysis. We support this claim with the results of an experiment that measures the safety of parallelization at run time for loops left unparallelized by the Stanford SUIF compiler’s automatic parallelization system. We present results of measurements on programs from two benchmark suites, the SPECfp95 and NAS sample benchmarks, which identify inherently parallel loops in these programs that are missed by the compiler. We characterize remaining parallelization opportunities, and find that most of the loops require run-time testing, analysis of control flow, or some combination of the two. We present a new compile-time analysis technique that can be used to parallelize most of these remaining loops. This technique is designed to not only improve the results of compile-time parallelization, but also to produce low-cost, directed run-time tests that allow the system to defer binding of parallelization until run time when safety cannot be proven statically. We call this approach predicated array data-flow analysis. We augment array data-flow analysis, which the compiler uses to identify independent and privatizable arrays, by associating predicates with array data-flow values. Predicated array data-flow analysis allows the compiler to derive “optimistic” data-flow values guarded by predicates; these predicates can be used to derive a run-time test guaranteeing the safety of parallelization.

[1] This work has been supported by DARPA Contract DABT63-95-C-0118 and NSF Contract ACI-9721368.
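
As a rough illustration of the predicate-guarded run-time test described in the abstract, the sketch below shows a two-version loop: a cheap predicate is evaluated at run time, and the parallel version runs only when the predicate guarantees that iterations are independent. This is not the paper's analysis or SUIF output. The loop body, the hand-derived predicate, and all names (`safe_to_parallelize`, `two_version_loop`, and so on) are hypothetical and chosen only for illustration. It assumes `k >= 0` and `len(a) >= n + k`.

```python
from concurrent.futures import ThreadPoolExecutor


def safe_to_parallelize(n: int, k: int) -> bool:
    """Hypothetical run-time test for the loop `a[i + k] = a[i] + 1`, i in range(n).

    Iteration i writes a[i + k] and reads a[i]. No iteration reads a location
    written by a different iteration when k == 0 (each iteration touches only
    its own element) or k >= n (the written and read ranges are disjoint).
    """
    return k == 0 or k >= n


def run_sequential(a, n, k):
    # Original loop, always safe to run in order.
    for i in range(n):
        a[i + k] = a[i] + 1


def run_parallel(a, n, k, workers=4):
    # Parallel version; only correct when safe_to_parallelize(n, k) holds.
    def body(i):
        a[i + k] = a[i] + 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(body, range(n)))


def two_version_loop(a, n, k):
    # Bind the parallelization decision at run time using the predicate.
    if safe_to_parallelize(n, k):
        run_parallel(a, n, k)
        return "parallel"
    run_sequential(a, n, k)
    return "sequential"
```

For example, with `n = 4` and `k = 4` the predicate holds and the parallel version is chosen, whereas with `n = 4` and `k = 1` each iteration depends on the previous one and the sequential version runs. The test itself costs two comparisons, which is the sense in which such a guard is low-cost relative to inspecting every array access at run time.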