The tangent FFT

Authors:
Daniel J. Bernstein
Affiliations:
Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, Chicago, IL
Venue:
AAECC'07 Proceedings of the 17th international conference on Applied algebra, algebraic algorithms and error-correcting codes
Year:
2007

Citing 4

Fast Fourier transforms: a tutorial review and a state of the art
Signal Processing

A New Fast Discrete Fourier Transform
Journal of VLSI Signal Processing Systems

A new matrix approach to real FFTs and convolutions of length 2^k
Computing

A Modified Split-Radix FFT With Fewer Arithmetic Operations
IEEE Transactions on Signal Processing

Cited 1

A Simple Compressive Sensing Algorithm for Parallel Many-Core Architectures
Journal of Signal Processing Systems

Abstract

The split-radix FFT computes a size-n complex DFT, when n is a large power of 2, using just 4n lg n - 6n + 8 arithmetic operations on real numbers. This operation count was first announced in 1968, stood unchallenged for more than thirty years, and was widely believed to be best possible. Recently James Van Buskirk posted software demonstrating that the split-radix FFT is not optimal. Van Buskirk's software computes a size-n complex DFT using only (34/9 + o(1))n lg n arithmetic operations on real numbers. There are now three papers attempting to explain the improvement from 4 to 34/9: Johnson and Frigo, IEEE Transactions on Signal Processing, 2007; Lundy and Van Buskirk, Computing, 2007; and this paper. This paper presents the "tangent FFT," a straightforward in-place cache-friendly DFT algorithm having exactly the same operation counts as Van Buskirk's algorithm. This paper expresses the tangent FFT as a sequence of standard polynomial operations, and pinpoints how the tangent FFT saves time compared to the split-radix FFT. This description is helpful not only for understanding and analyzing Van Buskirk's improvement but also for minimizing the memory-access costs of the FFT.
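
To make the two operation counts in the abstract concrete, the following is a minimal Python sketch that evaluates them for a few sizes. The function names (`lg`, `split_radix_ops`, `tangent_leading_term`) are hypothetical and chosen only for illustration. The split-radix count uses the exact formula quoted above. The tangent FFT value is only the leading term (34/9) n lg n, because the abstract does not give the lower-order terms hidden in the o(1), so it should not be read as an exact count.

```python
from fractions import Fraction


def lg(n: int) -> int:
    """Return log2(n) for a power of two n >= 2; raise ValueError otherwise."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of 2, at least 2")
    return n.bit_length() - 1


def split_radix_ops(n: int) -> int:
    """Split-radix count quoted in the abstract: 4n lg n - 6n + 8."""
    return 4 * n * lg(n) - 6 * n + 8


def tangent_leading_term(n: int) -> Fraction:
    """Leading term only, (34/9) n lg n; lower-order terms are NOT modeled."""
    return Fraction(34, 9) * n * lg(n)


# Asymptotic ratio of the leading coefficients: (34/9) / 4 = 17/18.
LEADING_RATIO = Fraction(34, 9) / 4

if __name__ == "__main__":
    for k in (4, 10, 20):
        n = 1 << k
        print(n, split_radix_ops(n), float(tangent_leading_term(n)))
    print("leading-coefficient ratio:", LEADING_RATIO)
```

The ratio 17/18 corresponds to roughly 5.6% fewer real operations in the limit of large n. For small n the unmodeled lower-order terms matter, so the two printed columns are not directly comparable there.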