An automated approach to improve communication-computation overlap in clusters

  • Authors:
  • Lewis Fishgold;Anthony Danalis;Lori Pollock;Martin Swany

  • Affiliations:
  • Department of Computer and Information Sciences, University of Delaware, Newark, DE;Department of Computer and Information Sciences, University of Delaware, Newark, DE;Department of Computer and Information Sciences, University of Delaware, Newark, DE;Department of Computer and Information Sciences, University of Delaware, Newark, DE

  • Venue:
  • IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
  • Year:
  • 2006

Abstract

Applications that execute on parallel clusters face scalability concerns because of the high communication overhead usually associated with such environments. Modern network technologies that support Remote Direct Memory Access (RDMA) can offer true zero-copy communication and reduce communication overhead by overlapping it with computation. For this approach to be effective, the parallel application running on the cluster must be structured in a way that allows communication and computation to overlap. Unfortunately, the trade-off between maintainability and performance often leads to a code structure that prevents this potential overlap from being exploited. This paper describes a source-to-source optimizing transformation, which can be performed by an automatic (or semi-automatic) system, that restructures MPI codes to maximize communication-computation overlap.
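
The kind of restructuring the abstract refers to can be illustrated with a minimal sketch. It is not the paper's transformation, and the abstract does not say what form that transformation takes. The sketch assumes a loop that produces data chunk by chunk and a transfer that can proceed in the background. All names (`fake_send`, `compute`, `blocking_version`, `overlapped_version`) are hypothetical, and a worker thread stands in for a non-blocking transfer such as MPI's `MPI_Isend` followed by `MPI_Waitall`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_send(chunk, delay=0.01):
    """Hypothetical stand-in for a network transfer: wait, then deliver a copy."""
    time.sleep(delay)
    return list(chunk)


def compute(i):
    """Hypothetical per-iteration kernel that produces one chunk of data."""
    return [i * j for j in range(4)]


def blocking_version(n):
    # Original structure: finish all computation, then communicate in bulk.
    chunks = [compute(i) for i in range(n)]
    return [fake_send(chunk) for chunk in chunks]


def overlapped_version(n):
    # Restructured: start each transfer as soon as its chunk is ready
    # (analogous to a non-blocking send), keep computing the next chunk,
    # and wait for all transfers only at the end (analogous to a wait-all).
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = []
        for i in range(n):
            chunk = compute(i)
            pending.append(pool.submit(fake_send, chunk))
        return [future.result() for future in pending]


if __name__ == "__main__":
    # Both structures must deliver the same data in the same order.
    assert blocking_version(5) == overlapped_version(5)
```

The sketch shows only that the restructured loop delivers the same data while letting transfers proceed alongside later iterations. It makes no performance claim, and any benefit in practice would depend on the network hardware and the MPI implementation.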