The R-LRPD Test: Speculative Parallelization of Partially Parallel Loops

Authors:
Francis Dang;Hao Yu;Lawrence Rauchwerger
Affiliations:
Texas A&M University;Texas A&M University;Texas A&M University
Venue:
IPDPS '02 Proceedings of the 16th International Symposium on Parallel and Distributed Processing
Year:
2002

Abstract

Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because these loops have access patterns that are complex or insufficiently defined at compile time. In our previously proposed framework, we speculatively executed a loop as a doall and applied a fully parallel data dependence test to determine whether it had any cross-processor dependences. If the test failed, the loop was re-executed serially. While this method exploits doall parallelism well, it can cause slowdowns for loops with even one cross-processor flow dependence, because the entire loop must then be re-executed sequentially. Moreover, any partial parallelism present in such loops is not exploited. We now propose a generalization of our speculative doall parallelization technique, called the Recursive LRPD test, that can extract and exploit the maximum available parallelism of any loop and that limits potential slowdowns to the overhead of the run-time dependence test itself. In this paper we present the base algorithm, an analysis of the different heuristics for its practical application, and a few experimental results on loops from Track, Spice, and FMA3D.
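The recursive scheme described in the abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not specify block scheduling, privatized writes, or how dependences are recorded, so all of these are choices made here for illustration. The names `run_block`, `r_lrpd`, `body`, `read`, and `write` are hypothetical. Each simulated processor runs a contiguous block of iterations against private storage, a test then finds the first block that read a location written by an earlier block, the blocks before it are committed, and the remainder is re-executed speculatively.

```python
def run_block(lo, hi, body, shared):
    """Speculatively run iterations [lo, hi) against private storage (assumed scheme)."""
    private = {}     # privatized writes of this simulated processor
    exposed = set()  # addresses read before being written locally

    def read(addr):
        if addr in private:
            return private[addr]
        exposed.add(addr)
        return shared[addr]

    def write(addr, value):
        private[addr] = value

    for i in range(lo, hi):
        body(i, read, write)
    return private, exposed


def r_lrpd(n_iters, body, shared, n_procs):
    """Simulate recursive speculation; returns the number of speculative rounds."""
    start, rounds = 0, 0
    while start < n_iters:
        rounds += 1
        chunk = -(-(n_iters - start) // n_procs)  # ceiling division
        blocks = [(lo, min(lo + chunk, n_iters))
                  for lo in range(start, n_iters, chunk)]
        # Speculative phase: no block modifies shared data here.
        results = [run_block(lo, hi, body, shared) for lo, hi in blocks]
        # Test phase: commit blocks in order until a cross-block flow dependence.
        written = set()
        for (lo, hi), (private, exposed) in zip(blocks, results):
            if exposed & written:
                break  # this block read stale data; redo it and all later blocks
            for addr, value in private.items():
                shared[addr] = value
            written |= private.keys()
            start = hi
    return rounds


# Hypothetical partially parallel loop: a[i] = a[src[i]] + 1,
# where only iteration 9 depends on an earlier iteration (iteration 2).
src = list(range(16))
src[9] = 2
a = [0] * 16
rounds = r_lrpd(16, lambda i, read, write: write(i, read(src[i]) + 1), a, n_procs=4)
```

In this sketch the first block of every round always commits, because no earlier block in that round could have written the data it reads. Each round therefore makes progress, which mirrors the abstract's claim that slowdown is bounded by the cost of the run-time test rather than by a full serial re-execution.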