This paper presents a novel scheme for maintaining accurate information about distributed data in message-passing programs. The ability to dynamically maintain the data-to-processor mapping, as well as the program contexts at which state changes occur, enables a variety of sophisticated optimizations. The algorithms described in this paper are based on the static single assignment (SSA) form of message-passing programs, which can be used both for performing many of the classical compiler optimizations during automatic parallelization and for analyzing user-written message-passing programs. Reaching-definition analysis is performed on the SSA structures to determine suitable communication points. We discuss possible optimizations and show how an appropriate representation of the data structures substantially reduces the associated overheads. Our scheme uniformly handles arbitrary subscripts in array references and supports general reducible control flow. Experimental results for a number of benchmarks on an IBM SP-2 show a substantial reduction in inter-processor communication as well as a marked improvement in total run-times. On 16 processors, we have observed reductions of 10-25% in total run-times for our SSA-based schemes compared to non-SSA-based schemes.
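To illustrate the idea of using reaching definitions on SSA form to choose communication points, the following is a minimal, hypothetical sketch (not the paper's algorithm, and the `Stmt` representation is an assumption): because SSA gives every name exactly one definition, each use of a distributed array value has a unique reaching definition, and a communication point can be placed immediately after that definition.

```python
# Hypothetical sketch: placing communication points using the
# single-reaching-definition property of SSA form. Not the paper's
# actual algorithm; statement representation is invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Stmt:
    line: int                              # program position of the statement
    defines: str = None                    # SSA name defined here (unique per name)
    uses: list = field(default_factory=list)  # SSA names read here

def place_communication(stmts):
    """Map each (name, use-site) pair to the line of its unique SSA
    definition; communication for that use can be placed right after it."""
    def_site = {s.defines: s.line for s in stmts if s.defines}
    points = {}
    for s in stmts:
        for name in s.uses:
            # SSA guarantees exactly one definition per name,
            # so the reaching definition is simply its def site.
            points[(name, s.line)] = def_site[name]
    return points

prog = [
    Stmt(1, defines="A_1"),                # initial version of array A
    Stmt(2, defines="A_2", uses=["A_1"]),  # local update produces A_2
    Stmt(3, uses=["A_2"]),                 # a remote processor reads A_2
]
print(place_communication(prog))
# → {('A_1', 2): 1, ('A_2', 3): 2}: the value read at line 3 should be
#   communicated after line 2, its unique reaching definition.
```

In a straight-line fragment like this the placement is trivial; the point of the SSA formulation is that, with phi-nodes at control-flow merges, the same single-definition lookup extends to programs with general reducible control flow.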