On non-blocking collectives in 3D FFTs

  • Authors:
  • Radhika S. Saksena

  • Affiliations:
  • Fujitsu Laboratories of Europe, Hayes, United Kingdom

  • Venue:
  • Proceedings of the second workshop on Scalable algorithms for large-scale systems
  • Year:
  • 2011

Abstract

With non-blocking global collective operations included in the MPI 3.0 draft specification, many fundamental algorithms, such as those for performing 3-dimensional (3D) FFTs, will be modified to take advantage of them. To be used routinely by HPC application users, such modifications must be suitable for incorporation into general-purpose FFT libraries. Here we present a general-purpose algorithmic strategy that uses non-blocking collective communication in the calculation of a single parallel 3D FFT. In this scheme, the global collective communication is partitioned into blocking and non-blocking components so that communication overlaps with computation in the 3D FFT calculation. We present benchmarks of the scheme for single-variable 3D FFTs on two architectures: (a) HECToR, a Cray XE6 machine, and (b) a Fujitsu PRIMERGY Intel Westmere cluster with an InfiniBand interconnect.
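The abstract does not spell out how the collective is partitioned, so the following is only a minimal single-process sketch of the general idea of overlapping a transpose exchange with FFT computation. It is not the paper's scheme. Everything in it is an assumption made for illustration: the slab decomposition along x, the plane-by-plane pipelining, the naive DFT standing in for a library FFT, and the names `dft`, `pipelined_fft3d` and `direct_dft3d`. Ranks are simulated as list slices, and "posting" and "waiting on" an exchange are modelled by holding one plane in a `pending` slot. In a real MPI 3.0 code those two steps would correspond to a non-blocking all-to-all such as `MPI_Ialltoall` followed by `MPI_Wait`.

```python
import cmath


def dft(v):
    """Naive O(n^2) 1D DFT; stands in for a real FFT library call."""
    n = len(v)
    return [sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def pipelined_fft3d(a, nranks=2):
    """3D DFT of a cubic array a[x][y][z], slab-decomposed along x.

    Assumes len(a) is divisible by nranks. Ranks are simulated in one process.
    """
    n = len(a)
    xs = n // nranks  # number of x-planes owned by each simulated rank

    # Stage 1: every rank transforms its local data along z.
    # Layout: local[rank][local_x][y][kz]
    local = [[[dft(a[x][y]) for y in range(n)]
              for x in range(r * xs, (r + 1) * xs)]
             for r in range(nranks)]

    out = [[[0j] * n for _ in range(n)] for _ in range(n)]
    pending = None  # models the one exchange currently "in flight"

    for kz in range(n + 1):
        posted = None
        if kz < n:
            # Stage 2: y-transform of plane kz on every rank. In a real code
            # this work would run while the exchange of plane kz-1 is in flight.
            send = [[dft([local[r][xl][y][kz] for y in range(n)])
                     for xl in range(xs)]
                    for r in range(nranks)]
            posted = (kz, send)  # "post" the non-blocking exchange of plane kz

        if pending is not None:
            # "Wait" for the previous plane, then finish it along x.
            kz_prev, recv = pending
            for ky in range(n):
                # Gather the x-column from all ranks (the transpose step).
                col = [recv[x // xs][x % xs][ky] for x in range(n)]
                for kx, val in enumerate(dft(col)):
                    out[kx][ky][kz_prev] = val
        pending = posted

    return out


def direct_dft3d(a):
    """Reference: direct triple-sum 3D DFT, for checking small inputs only."""
    n = len(a)

    def w(j, k):
        return cmath.exp(-2j * cmath.pi * j * k / n)

    return [[[sum(a[x][y][z] * w(x, kx) * w(y, ky) * w(z, kz)
                  for x in range(n) for y in range(n) for z in range(n))
              for kz in range(n)]
             for ky in range(n)]
            for kx in range(n)]
```

For a small cubic input such as a 4x4x4 array, `pipelined_fft3d(a, nranks=2)` should agree with `direct_dft3d(a)` to within floating-point rounding. Because the simulation is serial, it shows only the ordering of compute and exchange steps, not any actual speed-up from overlap.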