Overlapping communication and computation by using a hybrid MPI/SMPSs approach

  • Authors:
  • Vladimir Marjanović; Jesús Labarta; Eduard Ayguadé; Mateo Valero

  • Affiliations:
  • Vladimir Marjanović: Barcelona Supercomputing Center (BSC-CNS); Jesús Labarta, Eduard Ayguadé, Mateo Valero: Barcelona Supercomputing Center (BSC-CNS) and Technical University of Catalunya (UPC)

  • Venue:
  • Proceedings of the 24th ACM International Conference on Supercomputing
  • Year:
  • 2010


Abstract

Communication overhead is one of the dominant factors affecting performance in high-end computing systems. To reduce the negative impact of communication, programmers overlap communication and computation by using asynchronous communication primitives. This increases code complexity, requiring more development effort and making programs less readable. This paper presents the hybrid use of MPI and SMPSs (SMP superscalar, a task-based shared-memory programming model), which allows the programmer to easily introduce the asynchrony necessary to overlap communication and computation. We also describe implementation issues in the SMPSs run time that support its efficient interoperation with MPI. We demonstrate the hybrid use of MPI/SMPSs with four application kernels (matrix multiply, Jacobi, conjugate gradient and NAS BT) and with the High-Performance LINPACK benchmark. For the application kernels, the hybrid MPI/SMPSs versions significantly improve the performance of the pure MPI counterparts. For LINPACK we get close to the asymptotic performance at relatively small problem sizes and still get significant benefits at large problem sizes. In addition, the hybrid MPI/SMPSs approach substantially reduces code complexity and is less sensitive to network bandwidth and operating system noise than the pure MPI versions.
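The core idea the abstract describes — hiding communication latency behind independent computation — can be emulated without an MPI installation. The sketch below is not MPI or SMPSs code: it uses a background thread as a stand-in for an asynchronous transfer (analogous to an MPI_Isend/MPI_Wait pair), and the function names `fake_communication` and `local_computation` are illustrative placeholders, not anything from the paper.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_communication(delay=0.2):
    # Stand-in for an in-flight message: the "network transfer"
    # proceeds while the caller keeps computing.
    time.sleep(delay)
    return "halo data"

def local_computation(n=200_000):
    # Work that does not depend on the in-flight message.
    return sum(i * i for i in range(n))

# Blocking style: communicate, then compute (the times add up).
t0 = time.perf_counter()
fake_communication()
partial = local_computation()
blocking = time.perf_counter() - t0

# Overlapped style: start the "communication", compute meanwhile,
# then wait for the result (analogous to MPI_Wait).
t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=1) as pool:
    request = pool.submit(fake_communication)   # like MPI_Isend
    partial = local_computation()               # overlapped work
    halo = request.result()                     # like MPI_Wait
overlapped = time.perf_counter() - t0

print(f"blocking:   {blocking:.3f}s")
print(f"overlapped: {overlapped:.3f}s")
```

The overlapped version finishes in roughly max(communication, computation) rather than their sum. The paper's contribution is to obtain this overlap automatically: SMPSs tasks annotated with their data dependences let the runtime schedule computation while MPI transfers are in flight, instead of the programmer hand-coding nonblocking calls as above.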