Optimizing the number of arithmetic operations required in fast Fourier transform (FFT) algorithms has been the focus of extensive research, but memory management is of comparable importance on modern processors. In this article, we investigate two known FFT algorithms, G and GT, that are similar to the Cooley-Tukey decimation-in-time and decimation-in-frequency FFT algorithms but that give an asymptotic reduction in the number of twiddle factor loads required for depth-first recursions. The algorithms also allow for aggressive vectorization (even for non-power-of-2 orders) and easier optimization of trivial twiddle factor multiplies. We benchmark G and GT implementations against comparable Cooley-Tukey implementations on commodity hardware. In a comparison designed to isolate the effect of twiddle factor access optimization, these benchmarks show typical speedups ranging from 10% to 65%, depending on transform order, precision, and vectorization. A more heavily optimized implementation of GT yields substantial performance improvements over the widely used FFTW code for many transform orders. The twiddle factor access optimization technique can be generalized to other common FFT algorithms, including real-data FFTs, split-radix FFTs, and multidimensional FFTs.
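To make the notion of a twiddle factor load concrete, the following is a minimal sketch of the baseline only: a textbook depth-first, radix-2, decimation-in-time Cooley-Tukey recursion, not the G or GT algorithm. The function names `fft_dit` and `naive_dft` and the one-element `counter` list are illustrative assumptions, and the call to `cmath.exp` merely stands in for a load from a precomputed twiddle table. Trivial twiddle factors (w = 1) are skipped, so the counter records only nontrivial loads.

```python
import cmath


def fft_dit(x, counter):
    """Depth-first radix-2 decimation-in-time Cooley-Tukey FFT (baseline sketch).

    x       -- list of complex samples; length must be a power of 2
    counter -- one-element list; counter[0] accumulates nontrivial twiddle loads
    """
    n = len(x)
    if n == 1:
        return list(x)

    # Depth-first recursion: finish the even half before starting the odd half.
    even = fft_dit(x[0::2], counter)
    odd = fft_dit(x[1::2], counter)

    half = n // 2
    out = [0j] * n
    for k in range(half):
        if k == 0:
            # Trivial twiddle factor w = 1: no load and no multiply needed.
            t = odd[k]
        else:
            # Stand-in for loading w = exp(-2*pi*i*k/n) from a twiddle table.
            w = cmath.exp(-2j * cmath.pi * k / n)
            counter[0] += 1
            t = w * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def naive_dft(x):
    """O(n^2) reference DFT used only to check the sketch above."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
```

Calling `fft_dit(samples, counter)` with `counter = [0]` returns the transform and leaves the load count in `counter[0]`. In this baseline every sub-transform reloads its own twiddle factors, which is the access pattern that the abstract says G and GT reduce asymptotically.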