A parallel 1-D FFT algorithm for the Hitachi SR8000

Authors: Daisuke Takahashi
Affiliations: Institute of Information Sciences and Electronics, University of Tsukuba, 1-1-1 Tennodai, Tsukuba-shi, Ibaraki 305-8573, Japan
Venue: Parallel Computing
Year: 2003

Abstract

In this paper, we propose a high-performance parallel one-dimensional fast Fourier transform (FFT) algorithm for clusters of vector symmetric multiprocessor (SMP) nodes. The four-step FFT algorithm can be reformulated as a five-step FFT algorithm that lengthens the innermost loop, and we use this five-step algorithm to implement the parallel one-dimensional FFT. Because the proposed algorithm uses a cyclic distribution, all-to-all communication takes place only once. Moreover, both the input data and the output data are in natural order. We report performance results for one-dimensional power-of-two FFTs on the Hitachi SR8000, a cluster of pseudo-vector SMP nodes. We obtained over 61 GFLOPS on a 16-node Hitachi SR8000/MPP.
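
For readers unfamiliar with the four-step FFT that the abstract takes as its starting point, the following is a minimal serial sketch of the classic decomposition of a length-n transform with n = n1 * n2 into short transforms, a twiddle-factor multiplication, and a regrouping of the data. It is not the paper's implementation and does not show the five-step variant or any parallel communication. The function names `dft` and `four_step_fft` are hypothetical, and a naive O(n^2) DFT stands in for the short FFTs so that the block needs only the Python standard library.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT; stands in for the short FFTs and serves as a reference."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def four_step_fft(x, n1, n2):
    """Length n1*n2 DFT via the four-step decomposition (illustrative sketch).

    Input index:  j = j1 + j2 * n1
    Output index: k = k2 + k1 * n2
    """
    n = n1 * n2
    assert len(x) == n

    # Step 1: n1 transforms of length n2, one per strided column j1.
    cols = [dft([x[j1 + j2 * n1] for j2 in range(n2)]) for j1 in range(n1)]

    # Step 2: multiply element (j1, k2) by the twiddle factor w_n^(j1 * k2).
    for j1 in range(n1):
        for k2 in range(n2):
            cols[j1][k2] *= cmath.exp(-2j * cmath.pi * j1 * k2 / n)

    # Steps 3 and 4: regroup by k2 (the transpose) and run n2 transforms
    # of length n1, writing results in natural order.
    out = [0j] * n
    for k2 in range(n2):
        row = dft([cols[j1][k2] for j1 in range(n1)])
        for k1 in range(n1):
            out[k2 + k1 * n2] = row[k1]
    return out


if __name__ == "__main__":
    data = [complex(i, -0.5 * i) for i in range(12)]
    result = four_step_fft(data, 3, 4)
    reference = dft(data)
    print(max(abs(a - b) for a, b in zip(result, reference)))
```

In this sketch, each column in step 1 consists of elements whose indices are congruent modulo n1. If element j were stored on process j mod P and P divided n1, every such column would reside on a single process, leaving the regrouping before the final transforms as the only step that needs data from other processes. Whether the paper arranges its data exactly this way is not stated in the abstract.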