Transformations to Parallel Codes for Communication-Computation Overlap
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Applications that execute on parallel clusters face scalability concerns because of the high communication overhead usually associated with such environments. Modern network technologies that support Remote Direct Memory Access (RDMA) can offer true zero-copy communication and reduce communication overhead by overlapping it with computation. For this approach to be effective, the parallel application using the cluster must be structured in a way that enables communication-computation overlapping. Unfortunately, the trade-off between maintainability and performance often leads to a structure that prevents exploiting the potential for communication-computation overlapping. This paper describes a source-to-source optimizing transformation that can be performed by an automatic (or semi-automatic) system in order to restructure MPI codes towards maximizing communication-computation overlapping.
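To make the idea concrete, the sketch below (plain C with MPI, not code from the paper) shows the kind of restructuring such a transformation aims for: nonblocking MPI_Irecv/MPI_Isend are posted early, the computation that does not depend on the incoming data runs while the transfer is in flight, and the matching MPI_Waitall is deferred until just before the dependent computation. The ring topology, buffer sizes, and the compute_interior/compute_boundary kernels are assumptions made purely for illustration.

```c
/* Illustrative sketch only: a nearest-neighbour exchange restructured so the
 * transfer overlaps with computation on data that does not depend on it.
 * Ranks, sizes and the "compute_*" routines are hypothetical placeholders. */
#include <mpi.h>
#include <stdio.h>

#define N 1024

static double halo_in[N], halo_out[N], interior[N];

/* Stand-ins for the application's real kernels. */
static void compute_interior(double *a, int n) { for (int i = 0; i < n; i++) a[i] *= 2.0; }
static void compute_boundary(const double *halo, int n) { (void)halo; (void)n; }

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int right = (rank + 1) % size;
    int left  = (rank - 1 + size) % size;

    /* Post the communication early with nonblocking calls ... */
    MPI_Request reqs[2];
    MPI_Irecv(halo_in,  N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(halo_out, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ... overlap: do the work that does not need the incoming halo ... */
    compute_interior(interior, N);

    /* ... and wait only just before the dependent computation, so the
     * transfer can proceed "behind" the interior work instead of
     * serialising with it. */
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    compute_boundary(halo_in, N);

    if (rank == 0) printf("exchange complete on %d ranks\n", size);
    MPI_Finalize();
    return 0;
}
```

Whether the transfer actually progresses concurrently with compute_interior depends on the MPI implementation and the interconnect; RDMA-capable networks, as noted above, are what make this overlap genuinely effective.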