In this paper we present an adaptive and portable software library for the fast Fourier transform (FFT). The library consists of a number of composable blocks of code called codelets, each computing a part of the transform. The FFT algorithm actually used is determined at run time by selecting, for a given transform size, the fastest of all strategies that can be built from the available codelets. We also present an efficient automatic method of generating the library modules by using a special-purpose compiler. The code generator is written in C and generates a library of C codelets. The code generator is shown to be flexible and extensible, and the entire library can be generated in a matter of seconds. We have evaluated the performance of the library on the IBM-SP2, SGI-2000, HP-Exemplar and Intel Pentium systems. We use the results of these evaluations to build performance models for the FFT library on different platforms. The library is shown to be portable, adaptive and efficient.
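The run-time selection described above can be illustrated with a minimal sketch. It is not the library's implementation: it is written in Python rather than C, the codelet sizes in `CODELET_SIZES` are a hypothetical choice, the "codelet" is a plain direct DFT standing in for generated straight-line code, and the selection is done by timing every candidate exhaustively, which is only one possible way to pick the fastest strategy. All function names (`dft_codelet`, `plans`, `execute`, `fastest_plan`) are introduced here for illustration.

```python
import cmath
import time

# Hypothetical set of sizes for which a codelet is assumed to exist.
CODELET_SIZES = (2, 3, 4, 8)


def dft_codelet(x):
    """Direct O(n^2) DFT; a stand-in for a generated fixed-size codelet."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def plans(n):
    """Yield every ordered factorization of n into available codelet sizes."""
    if n in CODELET_SIZES:
        yield (n,)
    for r in CODELET_SIZES:
        if r < n and n % r == 0:
            for rest in plans(n // r):
                yield (r,) + rest


def execute(plan, x):
    """Mixed-radix decimation-in-time FFT following the radices in plan."""
    if len(plan) == 1:
        return dft_codelet(x)
    n = len(x)
    r = plan[0]
    m = n // r
    # Transform the r interleaved subsequences with the rest of the plan.
    subs = [execute(plan[1:], x[q::r]) for q in range(r)]
    out = [0j] * n
    for k in range(m):
        # Apply twiddle factors, then combine with a size-r codelet.
        twiddled = [
            subs[q][k] * cmath.exp(-2j * cmath.pi * q * k / n) for q in range(r)
        ]
        column = dft_codelet(twiddled)
        for p in range(r):
            out[k + m * p] = column[p]
    return out


def fastest_plan(n, repeats=3):
    """Time every plan for size n and return the fastest (None if no plan)."""
    x = [complex(i, 0) for i in range(n)]
    best, best_time = None, float("inf")
    for plan in plans(n):
        start = time.perf_counter()
        for _ in range(repeats):
            execute(plan, x)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best, best_time = plan, elapsed
    return best
```

Calling `fastest_plan(24)` returns a tuple of radices such as `(3, 8)` or `(4, 2, 3)`; which one wins depends on the machine, which is the point of choosing at run time. A real library would presumably avoid exhaustive timing for large sizes, for instance by using performance models like those mentioned in the abstract.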