An Adaptation of the Fast Fourier Transform for Parallel Processing
Journal of the ACM (JACM)
A Parallel Algorithm for 2-D DFT Computation with No Interprocessor Communication
IEEE Transactions on Parallel and Distributed Systems
A Parallel Implementation of the Fast Fourier Transform Algorithm
IEEE Transactions on Computers
Self-sorting in-place FFT algorithm with minimum working space
IEEE Transactions on Signal Processing
A framework for low-communication 1-D FFT
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
A framework for low-communication 1-D FFT
Scientific Programming - Selected Papers from Super Computing 2012
Many parallel versions of the Cooley-Tukey FFT algorithm have been proposed. Most of them treat the FFT as a multistage computation with data permutations between stages, which requires extensive interprocessor communication. To reduce or eliminate this communication, Gertner, Tolimieri, and their colleagues [9-12] proposed a new multidimensional (M-D) FFT algorithm called the Reduced Transform Algorithm (RTA). In this paper, we extend the idea of the RTA to the M-D Cooley-Tukey (C-T) FFT algorithm and the M-D Good-Thomas (G-T) prime factor algorithm. We discuss a new implementation strategy for these algorithms that requires no interprocessor communication. Finally, we describe a hybrid algorithm that combines the C-T or G-T algorithm with the RTA.
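To make the Good-Thomas (G-T) prime factor idea named in the abstract concrete, the following Python sketch maps a 1-D DFT of length N = N1 * N2, with N1 and N2 coprime, onto an N1 x N2 grid so that the two short DFT stages need no twiddle factors between them. It is only an illustration under stated assumptions: the function name `good_thomas_dft` and the use of NumPy are not taken from the paper, and the sketch shows neither the RTA nor the communication-free M-D implementation strategy the paper describes.

```python
import numpy as np

def good_thomas_dft(x, n1, n2):
    """1-D DFT of length n1*n2 via the Good-Thomas index mapping.

    Hypothetical helper for illustration; n1 and n2 must be coprime
    Python ints (pow(a, -1, m) needs Python 3.8+).
    """
    n = n1 * n2
    x = np.asarray(x, dtype=complex)
    assert x.shape == (n,) and np.gcd(n1, n2) == 1

    a = np.arange(n1)[:, None]   # row index (n1 or k1)
    b = np.arange(n2)[None, :]   # column index (n2 or k2)

    # Input map: sample index = (N2*a + N1*b) mod N.
    in_idx = (n2 * a + n1 * b) % n
    grid = x[in_idx]

    # Two independent stages of short DFTs, with no twiddle
    # multiplication in between because gcd(n1, n2) == 1.
    spec = np.fft.fft(np.fft.fft(grid, axis=0), axis=1)

    # Output map (Chinese remainder theorem): the frequency index k
    # satisfies k = a (mod N1) and k = b (mod N2).
    q1 = pow(n2, -1, n1)         # inverse of N2 modulo N1
    q2 = pow(n1, -1, n2)         # inverse of N1 modulo N2
    out_idx = (a * n2 * q1 + b * n1 * q2) % n

    out = np.empty(n, dtype=complex)
    out[out_idx] = spec
    return out
```

For example, `good_thomas_dft(x, 3, 5)` on a length-15 array `x` can be compared against `np.fft.fft(x)` with `np.allclose`. In a parallel setting each row or column DFT of `grid` is an independent task, but the sketch still gathers the whole grid between the two stages, which is the kind of data movement the paper aims to avoid.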