Globalizing selectively: shared-memory efficiency with address-space separation

  • Authors:
  • Nilesh Mahajan;Uday Pitambare;Arun Chauhan

  • Affiliations:
  • Indiana University, Bloomington, IN;Indiana University, Bloomington, IN;Indiana University, Bloomington, IN

  • Venue:
  • SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
  • Year:
  • 2013

Abstract

It has become common for MPI-based applications to run on shared-memory machines. However, MPI semantics prevent the MPI library itself from fully exploiting shared memory for communication between processes. This paper presents an approach that combines compiler transformations with a specialized runtime system to achieve zero-copy communication whenever possible. The compiler proves certain properties statically and globalizes data selectively by altering the allocation and deallocation of communication buffers. When such proofs are not possible statically, the runtime system provides dynamic optimization by copying data only when there are write-write or read-write conflicts. We implemented a prototype compiler, using ROSE, and evaluated it on several benchmarks. Our system produces code that performs better than MPI in most cases and, in all cases, no worse than MPI tuned for shared memory.
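The idea of copying only on conflict can be illustrated with a small toy model. The sketch below is not the paper's compiler or runtime; it is a minimal single-process illustration, assuming a hypothetical `SharedBuffer`/`Handle` pair and a hypothetical `send` function. A "send" passes a reference to the same storage (zero-copy), and a private copy is made only when one side writes while the other still aliases the buffer. Treating every such write as a conflict is a conservative simplification of the write-write and read-write cases named in the abstract.

```python
class SharedBuffer:
    """Payload plus a count of the handles that currently alias it (hypothetical)."""
    copies_made = 0  # instrumentation only, to show when a copy really happens

    def __init__(self, data):
        self.data = list(data)
        self.owners = 1


class Handle:
    """One logical process's view of a communication buffer (hypothetical)."""

    def __init__(self, buf):
        self._buf = buf

    def read(self, index):
        # Reads never force a copy.
        return self._buf.data[index]

    def write(self, index, value):
        # A write to storage that another handle still aliases is treated
        # as a conflict, so this handle takes a private copy first.
        if self._buf.owners > 1:
            self._buf.owners -= 1
            self._buf = SharedBuffer(self._buf.data)
            SharedBuffer.copies_made += 1
        self._buf.data[index] = value

    def aliases(self, other):
        return self._buf is other._buf


def send(handle):
    """Zero-copy 'send': the receiver gets a handle to the same storage."""
    handle._buf.owners += 1
    return Handle(handle._buf)


sender = Handle(SharedBuffer([1, 2, 3]))
receiver = send(sender)                        # no data copied here
shared_before_write = receiver.aliases(sender)
sender.write(0, 99)                            # conflict: sender privatizes its buffer
```

After the write, the receiver still sees the value that was sent, while the sender sees its own update. If neither side ever wrote to the buffer, no copy would be made at all, which is the case a static proof would let a compiler guarantee ahead of time.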