Performance evaluation of hybrid parallel programming paradigms

  • Authors:
  • Achal Prabhakar;Vladimir Getov

  • Affiliations:
  • Performance and Architecture Lab, CCS-3, LANL, New Mexico and Department of Computer Science, University of Houston, Texas;Performance and Architecture Lab, CCS-3, LANL, New Mexico and School of Computer Science, University of Westminster, London, UK

  • Venue:
  • Performance analysis and grid computing
  • Year:
  • 2004

Abstract

With the trend in the supercomputing world shifting from homogeneous machine architectures to hybrid clusters of SMP nodes, the interoperability of OpenMP and MPI has become a key issue in understanding and optimizing overall system performance. While the low-level performance of MPI and OpenMP can each be evaluated using existing benchmarks, the combination of the two poses new challenges. A performance study of different hybrid programming paradigms is therefore of high benefit to both vendors and the user community. As part of our project, we have identified several possible combinations of the two models in order to provide qualitative and quantitative justification of the situations in which any one of them is to be favoured. Collective operations are particularly important to analyze and evaluate on a hybrid platform, and we therefore concentrate our study on three of them: barrier, all-to-all, and all-reduce. The issues that need to be evaluated include the optimal mix of OpenMP and MPI, the most efficient way of managing MPI communication from within OpenMP, the optimal unit of communication, and the degree of overlap between computation and communication. The performance results supporting this investigation were taken on the IBM Power-3 machine at the San Diego Supercomputer Center using our suite of hybrid microbenchmarks.
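
To make the idea of a hybrid collective concrete, the following is a minimal sketch of one possible two-level all-reduce, written as a plain Python simulation rather than real MPI or OpenMP code. It is not taken from the authors' microbenchmark suite. The function names `hybrid_allreduce` and `inter_node_participants`, the nested-list data layout, and the choice of a single communicating representative per node are all assumptions made for illustration.

```python
from functools import reduce
from operator import add


def hybrid_allreduce(values_per_node, op=add):
    """Simulate a two-level all-reduce (illustrative, hypothetical helper).

    values_per_node: list of lists. The outer index stands in for an SMP
    node (one MPI process), the inner index for a thread on that node
    (an OpenMP thread). Each node is assumed to hold at least one value.
    Returns a structure of the same shape in which every entry is the
    global reduction.
    """
    # Step 1 (intra-node, shared memory): combine the threads' values per node.
    node_partials = [reduce(op, thread_values) for thread_values in values_per_node]
    # Step 2 (inter-node, message passing): one representative per node
    # combines the partial results. Real code would use an MPI all-reduce here.
    global_result = reduce(op, node_partials)
    # Step 3 (intra-node): every thread on every node receives the result.
    return [[global_result] * len(thread_values) for thread_values in values_per_node]


def inter_node_participants(values_per_node, hybrid=True):
    """Count the endpoints taking part in the inter-node step.

    hybrid=True: one endpoint per node. hybrid=False: one endpoint per
    thread, as if every thread were its own message-passing process.
    """
    if hybrid:
        return len(values_per_node)
    return sum(len(thread_values) for thread_values in values_per_node)


if __name__ == "__main__":
    data = [[1, 2], [3, 4], [5, 6]]  # 3 nodes with 2 threads each
    print(hybrid_allreduce(data))
    print(inter_node_participants(data), inter_node_participants(data, hybrid=False))
```

In this sketch only one endpoint per node takes part in the inter-node step, which is one of several possible ways to mix the two models. Whether that arrangement is actually faster than letting every thread communicate is the kind of question the abstract says has to be measured, and the simulation makes no claim about it.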