Static Single Assignment Form for Message-Passing Programs

Authors:
Dhruva R. Chakrabarti; Prithviraj Banerjee
Affiliations:
Current address: Hewlett-Packard Company, California. dhruva@cup.hp.com; Center for Parallel and Distributed Computing, ECE Dept., Technological Institute, Northwestern University, 2145 Sheridan Road, Evanston, Illinois 60208-3118. banerjee@ece.nwu.edu
Venue:
International Journal of Parallel Programming
Year:
2001

Abstract

This paper presents a novel scheme for maintaining accurate information about distributed data in message-passing programs. We describe static single assignment (SSA) based algorithms that build an intermediate representation of a sequential program while targeting code generation for distributed-memory machines under the single program multiple data (SPMD) programming model. This SSA-based intermediate representation supports a variety of optimizations performed by our automatic parallelizing compiler, PARADIGM, which generates message-passing programs for distributed-memory machines. In this paper, we concentrate on the semantics and implementation of this SSA form for message-passing programs and give some examples of the kinds of optimizations it enables. We describe in detail the need for various kinds of merge functions to maintain the single-assignment property of distributed data. We give algorithms for the placement and semantics of these merge functions and show how the requirements are substantially different owing to the presence of distributed data and arbitrary array addressing functions. This scheme has been incorporated into our compiler framework, which can use uniform methods to compile, parallelize, and optimize a sequential program irrespective of the subscripts used in array addressing functions. Experimental results for a number of benchmarks on an IBM SP-2 show a significant improvement in total runtimes owing to some of the optimizations enabled by the SSA-based intermediate representation. On 16 processors, we have observed reductions of up to around 10–25% in total runtime for our SSA-based schemes compared to non-SSA-based schemes.
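
As background for the single-assignment property and merge functions mentioned in the abstract, the following is a minimal sketch of conventional scalar SSA renaming, in which a `phi` merge is inserted at the join after an if/else. It is not the paper's algorithm: the abstract states that the merge functions needed for distributed data and arbitrary array subscripts differ substantially from this. The tuple-based mini-IR, the function name `to_ssa`, and the placeholder operator `f(...)` are all assumptions made for illustration.

```python
# Hypothetical mini-IR (illustration only, not from the paper):
#   ("assign", target, [operand names])
#   ("if", condition name, then_block, else_block)

def to_ssa(block, versions=None, counters=None, out=None, indent=""):
    """Rename variables so each name is assigned once; merge with phi after if/else."""
    versions = {} if versions is None else versions  # current SSA name per variable
    counters = {} if counters is None else counters  # last version number issued
    out = [] if out is None else out                 # emitted SSA lines

    def fresh(var):
        counters[var] = counters.get(var, 0) + 1
        return f"{var}{counters[var]}"

    def use(var, env):
        # Version 0 stands for the value a variable has on entry.
        return env.get(var, f"{var}0")

    for stmt in block:
        if stmt[0] == "assign":
            _, target, operands = stmt
            rhs = [use(v, versions) for v in operands]
            new_name = fresh(target)
            versions[target] = new_name
            out.append(f"{indent}{new_name} = f({', '.join(rhs)})")
        elif stmt[0] == "if":
            _, cond, then_block, else_block = stmt
            out.append(f"{indent}if {use(cond, versions)}:")
            then_env = dict(versions)
            to_ssa(then_block, then_env, counters, out, indent + "  ")
            out.append(f"{indent}else:")
            else_env = dict(versions)
            to_ssa(else_block, else_env, counters, out, indent + "  ")
            # At the join, any variable whose reaching name differs between
            # the two branches gets a new name defined by a phi merge.
            for var in sorted(set(then_env) | set(else_env)):
                a, b = use(var, then_env), use(var, else_env)
                if a != b:
                    new_name = fresh(var)
                    versions[var] = new_name
                    out.append(f"{indent}{new_name} = phi({a}, {b})")
    return out


program = [
    ("assign", "x", []),
    ("if", "c", [("assign", "x", ["x"])], [("assign", "y", ["x"])]),
    ("assign", "z", ["x", "y"]),
]
ssa_lines = to_ssa(program)
print("\n".join(ssa_lines))
```

In this sketch, the use of `x` and `y` after the branch refers to the phi-defined names rather than to either branch's definition, which is what keeps every name singly assigned. Handling loops, arrays, or data partitioned across processors would need machinery this example does not attempt.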