Automatic parallelization of divide and conquer algorithms

Authors:
Radu Rugina; Martin Rinard
Affiliations:
Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, MA; Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, MA
Venue:
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Year:
1999

Abstract

Divide and conquer algorithms are a good match for modern parallel machines: they tend to have large amounts of inherent parallelism, and they work well with caches and deep memory hierarchies. But these algorithms pose challenging problems for parallelizing compilers. They are usually coded as recursive procedures, and they often use pointer arithmetic and pointers into dynamically allocated memory blocks. All of these features are incompatible with the analysis algorithms in traditional parallelizing compilers.

This paper presents the design and implementation of a compiler that parallelizes divide and conquer algorithms whose subproblems access disjoint regions of dynamically allocated arrays. The foundation of the compiler is a flow-sensitive, context-sensitive, interprocedural pointer analysis algorithm. A range of symbolic analysis algorithms build on the pointer analysis information to extract symbolic bounds for the memory regions accessed by (potentially recursive) procedures that use pointers and pointer arithmetic. The symbolic bounds information allows the compiler to find procedure calls that can execute in parallel without violating the data dependences, and the compiler generates code that executes these calls in parallel. We have used the compiler to parallelize several programs that use divide and conquer algorithms. Our results show that the parallelized programs perform well and exhibit good speedup.
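
The property the abstract relies on, recursive calls that touch disjoint regions of one array, can be illustrated with a small sketch. This is not the paper's compiler: the compiler derives symbolic region bounds statically, whereas the sketch below states each call's region by hand and checks disjointness at run time. The names `Region`, `merge_sort`, and the `depth` cutoff are hypothetical, chosen only for illustration.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Half-open index range [lo, hi) that a call may read or write (hypothetical helper)."""
    lo: int
    hi: int

    def disjoint(self, other: "Region") -> bool:
        # Two half-open ranges do not overlap when one ends before the other starts.
        return self.hi <= other.lo or other.hi <= self.lo


def merge(data, lo, mid, hi):
    """Merge the sorted runs data[lo:mid] and data[mid:hi]; writes only indices in [lo, hi)."""
    merged = []
    i, j = lo, mid
    while i < mid and j < hi:
        if data[i] <= data[j]:
            merged.append(data[i])
            i += 1
        else:
            merged.append(data[j])
            j += 1
    merged.extend(data[i:mid])
    merged.extend(data[j:hi])
    for k, value in enumerate(merged):
        data[lo + k] = value


def merge_sort(data, lo, hi, depth=2):
    """Sort data[lo:hi] in place; every access stays inside [lo, hi)."""
    if hi - lo < 2:
        return
    mid = (lo + hi) // 2
    # Regions accessed by the two recursive calls, written out by hand here
    # (assumption: a static analysis would derive these bounds instead).
    left, right = Region(lo, mid), Region(mid, hi)
    if depth > 0 and left.disjoint(right):
        # Disjoint regions mean no data dependence between the calls,
        # so one of them can run in a separate thread.
        worker = threading.Thread(target=merge_sort, args=(data, lo, mid, depth - 1))
        worker.start()
        merge_sort(data, mid, hi, depth - 1)
        worker.join()
    else:
        # Below the cutoff, fall back to sequential recursion.
        merge_sort(data, lo, mid, 0)
        merge_sort(data, mid, hi, 0)
    # The merge reads both halves, so it must wait for both calls to finish.
    merge(data, lo, mid, hi)
```

Calling `merge_sort(values, 0, len(values))` sorts the list `values` in place. The `depth` cutoff bounds the number of threads created. On standard CPython builds the global interpreter lock prevents these threads from running Python code simultaneously, so the sketch shows the dependence argument rather than an actual speedup.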