Locality-improved FFT implementation on a graphics processor

  • Authors:
  • Sergio Romero; Maria A. Trenas; Eladio Gutierrez; Emilio L. Zapata

  • Affiliations:
  • Department of Computer Architecture, University of Málaga, Málaga, Spain (all authors)

  • Venue:
  • ISCGAV'07 Proceedings of the 7th WSEAS International Conference on Signal Processing, Computational Geometry & Artificial Vision
  • Year:
  • 2007

Abstract

The growing computational power of modern graphics processing units (GPUs) makes them well suited to general-purpose computing. These commodity processors generally operate as parallel SIMD platforms, and the effectiveness of the code depends, among other factors, on properly exploiting the underlying memory hierarchy. This paper deals with the implementation of the Fast Fourier Transform (FFT) on a novel graphics architecture recently introduced by NVIDIA. The implementation takes into account memory reference locality, which is crucial when pursuing a high degree of parallelism, that is, good occupancy of the processing elements. The proposed implementation has been tested and compared with the manufacturer's own implementation.