Improvements in arithmetic speed seem likely to continue to outpace advances in communication bandwidth. Furthermore, as more and more problems involve huge datasets, data will increasingly be distributed across many processors because no single processor has sufficient storage capacity. For these reasons, we propose that an inexact DFT, such as an approximate matrix-vector approach based on singular values or a variation of the Dutt-Rokhlin fast-multipole-based algorithm, may outperform any exact parallel FFT. The speedup may be as large as a factor of three in situations where FFT run time is dominated by communication. For the multipole idea we further propose that a method of "virtual charges" may improve accuracy, and we provide an analysis of the singular values that are needed for the approximate matrix-vector approaches.
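
As a rough illustration of the singular-value idea mentioned above, the following sketch builds one sub-block of the DFT matrix, truncates its SVD at a relative tolerance, and applies the resulting low-rank factors to a vector. It is a minimal sketch under stated assumptions, not the algorithm analyzed in the paper: the block size, the tolerance, and the helper names `dft_block` and `truncated_svd` are all hypothetical choices made for this example.

```python
import numpy as np

def dft_block(N, rows, cols):
    """Sub-block of the N-point DFT matrix, entries exp(-2*pi*i*j*k/N)."""
    j = np.asarray(rows)[:, None]
    k = np.asarray(cols)[None, :]
    return np.exp(-2j * np.pi * j * k / N)

def truncated_svd(A, tol):
    """Keep only singular values larger than tol times the largest one."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))
    return U[:, :r], s[:r], Vh[:r, :]

# Illustrative sizes (assumptions): a 64 x 64 block of a 1024-point DFT.
N = 1024
m = 64
block = dft_block(N, range(m), range(m))

# Low-rank factors of the block at a relative tolerance of 1e-10.
U, s, Vh = truncated_svd(block, 1e-10)
rank = len(s)

# Compare the exact block-vector product with the low-rank product.
rng = np.random.default_rng(0)
x = rng.standard_normal(m) + 1j * rng.standard_normal(m)
exact = block @ x
approx = U @ (s * (Vh @ x))
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)

print("numerical rank:", rank, "of", m)
print("relative error:", rel_err)
```

If the numerical rank comes out well below the block size, a processor could in principle send only `rank` coefficients (the vector `s * (Vh @ x)`) in place of the `m` entries of `x`, which is the kind of communication saving an inexact DFT would aim for. Whether that pays off in practice depends on the block layout and the accuracy required.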