A parallel 1-D FFT algorithm for the Hitachi SR8000

  • Authors: Daisuke Takahashi

  • Affiliations: Institute of Information Sciences and Electronics, University of Tsukuba, 1-1-1 Tennodai, Tsukuba-shi, Ibaraki 305-8573, Japan

  • Venue: Parallel Computing
  • Year: 2003

Abstract

In this paper, we propose a high-performance parallel one-dimensional fast Fourier transform (FFT) algorithm for clusters of vector symmetric multiprocessor (SMP) nodes. The four-step FFT algorithm can be reformulated as a five-step FFT algorithm that lengthens the innermost loop, and we use this five-step algorithm to implement the parallel one-dimensional FFT. Because the proposed algorithm uses a cyclic distribution, all-to-all communication takes place only once. Moreover, both the input and output data are in natural order. We report performance results for one-dimensional power-of-two FFTs on the Hitachi SR8000, a cluster of pseudo-vector SMP nodes. We obtained performance of over 61 GFLOPS on a 16-node Hitachi SR8000/MPP.
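
For readers unfamiliar with the four-step FFT that the abstract takes as its starting point, the following is a minimal serial sketch in Python. It is not the paper's implementation: the function names (`dft`, `four_step_fft`, `max_error`) are hypothetical, a naive O(n^2) DFT stands in for the short FFTs, and neither the five-step variant nor the cyclic distribution across nodes is reproduced, since the abstract does not give their details. It only illustrates how a length n = n1 * n2 transform splits into two batches of short transforms joined by a twiddle multiplication and a transpose.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT, used here only as a stand-in for the short FFTs."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def four_step_fft(x, n1, n2):
    """DFT of length n = n1 * n2 via the four-step decomposition (serial sketch)."""
    n = n1 * n2
    assert len(x) == n
    # View x as an n1 x n2 array: row j1 holds x[j1 + j2*n1] for j2 = 0..n2-1.
    rows = [[x[j1 + j2 * n1] for j2 in range(n2)] for j1 in range(n1)]
    # Step 1: n1 independent transforms of length n2 (index j2 -> k2).
    rows = [dft(r) for r in rows]
    # Step 2: multiply by the twiddle factors w_n^(j1*k2).
    for j1 in range(n1):
        for k2 in range(n2):
            rows[j1][k2] *= cmath.exp(-2j * cmath.pi * j1 * k2 / n)
    # Step 3: transpose to an n2 x n1 array.
    cols = [[rows[j1][k2] for j1 in range(n1)] for k2 in range(n2)]
    # Step 4: n2 independent transforms of length n1 (index j1 -> k1).
    cols = [dft(c) for c in cols]
    # Output index k = k2 + k1*n2 places the result in natural order.
    y = [0j] * n
    for k2 in range(n2):
        for k1 in range(n1):
            y[k2 + k1 * n2] = cols[k2][k1]
    return y


def max_error(n1, n2):
    """Largest absolute difference between the four-step result and a direct DFT."""
    x = [complex(j % 7, -(j % 3)) for j in range(n1 * n2)]
    ref = dft(x)
    got = four_step_fft(x, n1, n2)
    return max(abs(a - b) for a, b in zip(ref, got))
```

Calling `max_error(4, 8)` compares the decomposition against a direct DFT of length 32; the difference should be at the level of floating-point rounding. In this sketch the inner loops have lengths n1 and n2, which is the quantity the abstract says the five-step reformulation lengthens.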