Two parallel implementations for one dimension FFT on symmetric multiprocessors

  • Authors:
  • Rami A. Al Na'mneh;W. David Pan;B. Earl Wells

  • Affiliations:
  • Univ. of Alabama in Huntsville;Univ. of Alabama in Huntsville;Univ. of Alabama in Huntsville

  • Venue:
  • ACM-SE 42 Proceedings of the 42nd annual Southeast regional conference
  • Year:
  • 2004

Abstract

This paper presents an empirical comparison of two parallel implementations of a one-dimensional fast Fourier transform (FFT) targeted at a symmetric multiprocessor (SMP). It compares the run-time characteristics and overhead (time complexity) of the two algorithms with those reported in previous research. The scalability of the two algorithms is also assessed using the isoefficiency function, and the effect of caches on performance is presented. The isoefficiency function is defined as the rate at which the data size must grow with the number of processors to maintain constant efficiency. The two implementations are based on a tree and a transpose, respectively. In the tree algorithm, speedup does not increase linearly with the number of processors, although superlinear speedup can be achieved in the two-processor case. The transpose algorithm achieved approximately linear speedup with respect to the number of processors with only a moderate increase in the data size. Additional performance can be obtained by overlapping computation with communication and by efficient use of caches.
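
The abstract does not spell out the transpose algorithm, so the following is only a minimal sketch of the general idea behind transpose-based one-dimensional FFTs (the classic "four-step" decomposition), not the authors' implementation. It assumes the input length factors as `n1 * n2`; the names `dft`, `transpose`, and `transpose_fft` are hypothetical, and a naive DFT stands in for the per-row FFT to keep the example short. Each batch of row transforms is independent, which is where the work could be divided among processors, and the transposes are where data would be exchanged.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT, used here as a stand-in for a real per-row FFT."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(row) for row in zip(*m)]


def transpose_fft(x, n1, n2):
    """Four-step DFT of a length n1*n2 sequence (illustrative sketch only)."""
    n = n1 * n2
    assert len(x) == n
    # View x as an n1-by-n2 row-major matrix: m[a][b] = x[n2*a + b].
    m = [x[a * n2:(a + 1) * n2] for a in range(n1)]
    # Transpose so that each original column becomes a contiguous row.
    cols = transpose(m)                      # cols[b][a]
    # Step 1: independent length-n1 transforms (parallelizable across rows).
    cols = [dft(row) for row in cols]        # cols[b][c]
    # Step 2: multiply by the twiddle factors exp(-2*pi*i*b*c/n).
    cols = [
        [cols[b][c] * cmath.exp(-2j * cmath.pi * b * c / n) for c in range(n1)]
        for b in range(n2)
    ]
    # Step 3: transpose again (the communication step on a parallel machine).
    rows = transpose(cols)                   # rows[c][b]
    # Step 4: independent length-n2 transforms (parallelizable across rows).
    rows = [dft(row) for row in rows]        # rows[c][d]
    # Output index mapping: X[c + n1*d] = rows[c][d].
    out = [0j] * n
    for c in range(n1):
        for d in range(n2):
            out[c + n1 * d] = rows[c][d]
    return out
```

For example, `transpose_fft(x, 3, 4)` on a 12-point input should agree with `dft(x)` up to floating-point rounding. In a real SMP implementation the row transforms would be distributed across threads and `dft` would be replaced by an optimized FFT routine.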