The tangent FFT

  • Authors:
  • Daniel J. Bernstein

  • Affiliations:
  • Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago, Chicago, IL

  • Venue:
  • AAECC'07 Proceedings of the 17th international conference on Applied algebra, algebraic algorithms and error-correcting codes
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

The split-radix FFT computes a size-n complex DFT, when n is a large power of 2, using just 4n lg n-6n+8 arithmetic operations on real numbers. This operation count was first announced in 1968, stood unchallenged for more than thirty years, and was widely believed to be best possible. Recently James Van Buskirk posted software demonstrating that the split-radix FFT is not optimal. Van Buskirk's software computes a size-n complex DFT using only (34/9 + o(1))n lg n arithmetic operations on real numbers. There are now three papers attempting to explain the improvement from 4 to 34/9: Johnson and Frigo, IEEE Transactions on Signal Processing, 2007; Lundy and Van Buskirk, Computing, 2007; and this paper. This paper presents the "tangent FFT," a straightforward in-place cache-friendly DFT algorithm having exactly the same operation counts as Van Buskirk's algorithm. This paper expresses the tangent FFT as a sequence of standard polynomial operations, and pinpoints how the tangent FFT saves time compared to the split-radix FFT. This description is helpful not only for understanding and analyzing Van Buskirk's improvement but also for minimizing the memory-access costs of the FFT.