Non-data-communication Overheads in MPI: Analysis on Blue Gene/P

  • Authors:
  • Pavan Balaji; Anthony Chan; William Gropp; Rajeev Thakur; Ewing Lusk

  • Affiliations:
  • Math. and Comp. Sci. Div., Argonne Nat. Lab., Argonne, IL 60439, USA; Dept. of Astronomy and Astrophysics, Univ. of Chicago, Chicago, IL 60637, USA; Dept. of Computer Science, Univ. of Illinois, Urbana, IL 61801, USA; Math. and Comp. Sci. Div., Argonne Nat. Lab., Argonne, IL 60439, USA; Math. and Comp. Sci. Div., Argonne Nat. Lab., Argonne, IL 60439, USA

  • Venue:
  • Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
  • Year:
  • 2008

Abstract

Modern HEC systems, such as Blue Gene/P, achieve high performance by exploiting the parallelism of a massive number of low-frequency, low-power processing cores. This means that the local pre- and post-communication processing required by the MPI stack might not be very fast, owing to the slow processing cores. Similarly, small amounts of serialization within the MPI stack that were acceptable on small/medium systems can be brutal on massively parallel systems. In this paper, we study different non-data-communication overheads within the MPI implementation on the IBM Blue Gene/P system.
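
Non-data-communication overheads of the kind studied here can be approximated with a simple zero-byte ping-pong micro-benchmark: with no payload, the measured latency is dominated by pre- and post-communication processing in the MPI stack (and the network), rather than by data-transfer time. The sketch below is only an illustration of that idea, not the benchmark suite used in the paper; the iteration count and the use of ranks 0 and 1 are arbitrary choices.

    /* Zero-byte ping-pong sketch (illustrative, not the paper's benchmark).
     * Run with at least two ranks; ranks 0 and 1 exchange empty messages
     * and report the average one-way latency, which reflects per-message
     * MPI stack overhead rather than payload transfer. */
    #include <mpi.h>
    #include <stdio.h>

    #define ITERS 100000

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < ITERS; i++) {
            if (rank == 0) {
                MPI_Send(NULL, 0, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(NULL, 0, MPI_BYTE, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(NULL, 0, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(NULL, 0, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("avg one-way zero-byte latency: %.3f us\n",
                   (t1 - t0) * 1e6 / (2.0 * ITERS));

        MPI_Finalize();
        return 0;
    }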