Paper: Bluestein's FFT for arbitrary N on the hypercube

  • Authors:
  • Paul N. Swarztrauber;Roland A. Sweet;William L. Briggs;Van Emden Henson;James Otto

  • Affiliations:
  • National Center for Atmospheric Research**, Boulder, CO 80307, USA;Computational Mathematics Group, Department of Mathematics, University of Colorado at Denver, Denver, CO 80204, USA;Computational Mathematics Group, Department of Mathematics, University of Colorado at Denver, Denver, CO 80204, USA;Department of Mathematics, Naval Postgraduate Scool***, Monterey, CA 93933, USA;Computational Mathematics Group, Department of Mathematics, University of Colorado at Denver, Denver, CO 80204, USA

  • Venue:
  • Parallel Computing
  • Year:
  • 1991

Quantified Score

Hi-index 0.00

Visualization

Abstract

The original Cooley-Tukey FFT was published in 1965 and presented for sequences with length N equal to a power of two. However, in the same paper they noted that their algorithm could be generalized to composite N in which the length of the sequence was a product of small primes. In 1967, Bergland presented an algorithm for composite N and variants of his mixed radix FFT are currently in wide use. In 1968, Bluestein presented an FFT for arbitrary N including large primes. However, for composite N, Bluestein's FFT was not competitive with Bergland's FFT. Since it is usually possible to select a composite N, Bluestein's FFT did not receive much attention. Nevertheless because of its minimal communication requirements, the Bluestein FFT may be the algorithm of choice on multiprocessors, particularly those with the hypercube architecture. In contrast to the mixed radix FFT, the communication pattern of the Bluestein FFT maps quite well onto the hypercube. With P = 2^d processors, an ordered Bluestein FFT requires 2d communication cycles with packet length N/2P which is comparable to the requirements of a power of two FFT. For fine-grain computations, the Bluestein FFT requires 20log"2N computational cycles. Although this is double that required for a mixed radix FFT, the Bluestein FFT may nevertheless be preferred because of its lower communication costs. For most values of N it is also shown to be superior to another alternative, namely parallel multiplication.