Automatic Parallelization of Recursive Procedures

Authors:
Manish Gupta; Sayak Mukhopadhyay; Navin Sinha
Affiliations:
IBM T. J. Watson Research Center, Yorktown Heights, New York 10598. mgupta@us.ibm.com; Mobius Management Systems, Rye, New York 10580. smukhopa@mobius.com; IBM Global Services India, Bangalore 560017, India. r1navink@in.ibm.com
Venue:
International Journal of Parallel Programming
Year:
2000

Abstract

Parallelizing compilers have traditionally focused mainly on parallelizing loops. This paper presents a new framework for automatically parallelizing recursive procedures of the kind that typically appear in divide-and-conquer algorithms. We present a compile-time analysis, based on powerful symbolic array section analysis, that detects the independence of multiple recursive calls in a procedure. This allows exploitation of a scalable form of nested parallelism, in which each parallel task can spawn further parallel work in subsequent recursive calls. We describe a runtime system that efficiently supports this kind of nested parallelism without unnecessarily blocking tasks. We have implemented this framework in a parallelizing compiler, which can automatically parallelize programs such as quicksort and mergesort written in C. For cases where even the advanced compile-time analysis we describe cannot prove the independence of procedure calls, we propose novel techniques for speculative runtime parallelization, which are more efficient and powerful in this context than analogous techniques previously proposed for speculatively parallelizing loops. Our experimental results on an IBM G30 SMP machine show good speedups obtained by following our approach.
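
As a rough illustration of the idea in the abstract, the sketch below sorts an array with quicksort and runs the two recursive calls concurrently once the array sections they touch are known to be disjoint. It is not the paper's implementation: the paper targets C programs and proves independence at compile time, whereas this sketch uses Python and a simple runtime check as a stand-in. The names `sections_disjoint`, `partition`, `quicksort`, and the `depth` cutoff are assumptions made for this example only.

```python
import threading


def sections_disjoint(a, b):
    """Return True if two half-open index ranges (lo, hi) do not overlap.

    Hypothetical stand-in for the compile-time array section analysis:
    two calls that only touch disjoint sections are independent.
    """
    return a[1] <= b[0] or b[1] <= a[0]


def partition(data, lo, hi):
    """Lomuto partition of data[lo:hi]; returns the pivot's final index."""
    pivot = data[hi - 1]
    i = lo
    for j in range(lo, hi - 1):
        if data[j] < pivot:
            data[i], data[j] = data[j], data[i]
            i += 1
    data[i], data[hi - 1] = data[hi - 1], data[i]
    return i


def quicksort(data, lo, hi, depth=2):
    """Sort data[lo:hi] in place.

    While depth > 0, the two recursive calls run as nested parallel tasks:
    one in a new thread, the other in the current thread. The depth cutoff
    (an assumption of this sketch) bounds the number of threads created.
    """
    if hi - lo < 2:
        return
    p = partition(data, lo, hi)
    left, right = (lo, p), (p + 1, hi)
    if depth > 0 and sections_disjoint(left, right):
        # Independent calls: spawn one, keep working on the other, then join.
        task = threading.Thread(target=quicksort, args=(data, lo, p, depth - 1))
        task.start()
        quicksort(data, p + 1, hi, depth - 1)
        task.join()
    else:
        # Fall back to the original sequential order.
        quicksort(data, lo, p, depth)
        quicksort(data, p + 1, hi, depth)
```

Calling `quicksort(values, 0, len(values))` sorts the list `values` in place. In standard CPython the global interpreter lock prevents the threads from running Python bytecode simultaneously, so the sketch shows the task structure rather than a speedup. The speculative runtime techniques mentioned in the abstract are not modeled here.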