MPI and OpenMP paradigms on cluster of SMP architectures: the vacancy tracking algorithm for multi-dimensional array transposition

  • Authors:
  • Yun He;Chris H. Q. Ding

  • Affiliations:
  • Lawrence Berkeley National Laboratory, University of California, Berkeley, CA;Lawrence Berkeley National Laboratory, University of California, Berkeley, CA

  • Venue:
  • Proceedings of the 2002 ACM/IEEE conference on Supercomputing
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

We investigate remapping multi-dimensional arrays on cluster of SMP architectures under OpenMP, MPI, and hybrid paradigms. Traditional method of array transpose needs an auxiliary array of the same size and a copy back stage. We recently developed an in-place method using vacancy tracking cycles. The vacancy tracking algorithm outperforms the traditional 2-array method as demonstrated by extensive comparisons. The independence of vacancy tracking cycles allows efficient parallelization of the in-place method on SMP architectures at node level. Performance of multi-threaded parallelism using OpenMP are tested with different scheduling methods and different number of threads. The vacancy tracking method is parallelized using several parallel paradigms. At node level, pure OpenMP outperforms pure MPI by a factor of 2.76. Across entire cluster of SMP nodes, the hybrid MPI/OpenMP implementation outperforms pure MPI by a factor of 4.44, demonstrating the validity of the parallel paradigm of mixing MPI with OpenMP.