The adaptation of the Cooley-Tukey, Pease, and Stockham FFTs to vector computers is discussed. Each of these algorithms computes the same result, namely the discrete Fourier transform; they differ only in the way intermediate computations are stored. Yet it is this difference that makes one or another more appropriate, depending on the application. The difference also influences computational efficiency on a vector computer and motivates the development of methods to improve that efficiency. Each FFT is defined rigorously by a short expository FORTRAN program, which provides the basis for the discussion of vectorization. Several methods for lengthening vectors are discussed, including the case of multiple and multidimensional transforms, where M sequences of length N can be transformed as a single sequence of length MN using a 'truncated' FFT. The implementation of an in-place FFT on a computer with memory-to-memory architecture is made possible by in-place matrix-vector multiplication.
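As a rough illustration of the ideas summarized above, the following is a minimal Python sketch of a radix-2 Stockham autosort FFT. It is not the paper's FORTRAN program. The function names, the `batch` parameter, and the interleaved storage layout are assumptions made for this example. The inner loop over `k` is the one a vector computer would execute as a single vector operation. Its length starts at `batch` and doubles at every stage, so storing M sequences interleaved lengthens every vector by a factor of M. Under this assumed layout, the batched transform is the same computation as a length-MN Stockham FFT with its first log2(M) stages omitted, which is one plausible reading of the 'truncated' FFT mentioned in the abstract.

```python
import cmath


def stockham_fft(data, n, batch=1):
    """Radix-2 Stockham autosort FFT (illustrative sketch).

    Transforms `batch` interleaved sequences, each of length `n`.
    Assumed layout: element i of sequence s is stored at data[s + batch * i].
    The result uses the same layout and is in natural frequency order,
    so no bit-reversal pass is needed.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    if len(data) != n * batch:
        raise ValueError("len(data) must equal n * batch")

    src = [complex(v) for v in data]
    dst = [0j] * len(src)  # Stockham ping-pongs between two arrays

    half = n // 2    # half the length of the current sub-transforms
    stride = batch   # length of the vectorizable inner loop
    while half >= 1:
        for j in range(half):
            w = cmath.exp(-1j * cmath.pi * j / half)  # twiddle factor
            for k in range(stride):  # candidate vector loop
                a = src[k + stride * j]
                b = src[k + stride * (j + half)]
                dst[k + stride * (2 * j)] = a + b
                dst[k + stride * (2 * j + 1)] = (a - b) * w
        src, dst = dst, src
        half //= 2
        stride *= 2
    return src


def naive_dft(x):
    """Direct O(n^2) discrete Fourier transform, used only as a reference."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
```

Calling `stockham_fft(x, len(x))` transforms a single sequence. Calling `stockham_fft(interleaved, N, batch=M)` transforms M sequences at once, and the shortest inner loop then has length M rather than 1.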