Because of the increasing computational power of workstations and PCs, the peak processing power of clusters of workstations has been rising rapidly. However, sustained performance on a variety of applications lags far behind, because these systems offer lower communication performance. In this paper, we focus on improving the communication performance of applications running on clusters through aggressive compiler optimizations. We present a general interprocedural technique for performing communication optimizations across procedure boundaries. Our technique uses the results of local analysis to model communication as a communication loop, and then performs flow-sensitive interprocedural data-flow analysis to eliminate redundant communication and to perform communication aggregation. Our experimental results and projected analysis on clusters show that aggressive compiler communication optimizations are very important for systems with low communication performance and high computational power.
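To illustrate the flavor of the flow-sensitive analysis described above, the sketch below computes an "available communication" data-flow fact over a control-flow graph and flags communications that are redundant because they have already been performed on every incoming path. This is a minimal, hypothetical model: the graph representation, the `(array, section)` communication descriptors, and the omission of killing writes are all simplifying assumptions, not the paper's actual formulation.

```python
# Minimal sketch of flow-sensitive redundant-communication analysis.
# Assumptions (not from the paper): the program is a simple CFG, each node
# generates a set of (array, section) communication descriptors, and no
# intervening writes invalidate a communication (kill sets omitted).
from collections import defaultdict

def available_comm(cfg, entry, comms):
    """Forward data-flow: a communication descriptor is 'available' at a node
    if it has been performed on every path from entry to that node.
    Returns, per node, the descriptors that are redundant there."""
    nodes = set(cfg) | {s for succs in cfg.values() for s in succs}
    preds = defaultdict(list)
    for n, succs in cfg.items():
        for s in succs:
            preds[s].append(n)
    universe = set().union(*comms.values()) if comms else set()
    # Standard initialization: entry starts empty, all others optimistic.
    avail_in = {n: (set() if n == entry else set(universe)) for n in nodes}
    avail_out = {n: avail_in[n] | comms.get(n, set()) for n in nodes}
    changed = True
    while changed:  # iterate to a fixpoint
        changed = False
        for n in nodes:
            if n == entry:
                new_in = set()
            else:
                new_in = set(universe)
                for p in preds[n]:
                    new_in &= avail_out[p]  # meet: intersection over preds
            new_out = new_in | comms.get(n, set())
            if new_in != avail_in[n] or new_out != avail_out[n]:
                avail_in[n], avail_out[n] = new_in, new_out
                changed = True
    # A communication at n is redundant if it is already available on entry.
    return {n: comms.get(n, set()) & avail_in[n] for n in nodes}
```

For example, in a diamond-shaped CFG where both branches communicate the same array section, a repeat of that communication at the join point is reported as redundant, since it is available along every path:

```python
cfg = {'entry': ['a', 'b'], 'a': ['join'], 'b': ['join'], 'join': []}
comms = {'a': {('X', '1:100')}, 'b': {('X', '1:100')}, 'join': {('X', '1:100')}}
redundant = available_comm(cfg, 'entry', comms)
# redundant['join'] contains ('X', '1:100'); redundant['a'] is empty
```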