Memory Locality Exploitation Strategies for FFT on the CUDA Architecture

Authors:
Eladio Gutierrez;Sergio Romero;Maria A. Trenas;Emilio L. Zapata
Affiliations:
Department of Computer Architecture, University of Malaga, Malaga, Spain 29071;Department of Computer Architecture, University of Malaga, Malaga, Spain 29071;Department of Computer Architecture, University of Malaga, Malaga, Spain 29071;Department of Computer Architecture, University of Malaga, Malaga, Spain 29071
Venue:
High Performance Computing for Computational Science - VECPAR 2008
Year:
2008

Abstract

Modern graphics processing units (GPUs) are becoming increasingly suitable for general-purpose computing thanks to their growing computational power. These commodity processors generally follow a parallel SIMD execution model whose efficiency depends, among other factors, on proper exploitation of the explicit memory hierarchy. In this paper we analyze the implementation of the Fast Fourier Transform (FFT) using the programming model of the Compute Unified Device Architecture (CUDA), recently released by NVIDIA for its new graphics platforms. Within this model we propose an FFT implementation that takes into account memory reference locality, which is crucial for achieving high execution performance. This proposal has been experimentally tested and compared with other well-known approaches, such as the manufacturer's FFT library.
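
To give a sense of what locality-aware FFT decomposition can look like, the sketch below splits a length-N transform (N = n1 * n2) into batches of small sub-transforms, each touching only a small working set of the kind that could sit in fast on-chip memory. This is the generic textbook "four-step" factorization written in plain Python on the CPU. It is not the authors' CUDA implementation, and the names `dft`, `blocked_fft`, `n1`, and `n2` are hypothetical, chosen only for illustration.

```python
import cmath

def dft(x):
    """Naive O(n^2) DFT; stands in for a small transform done in fast memory."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]

def blocked_fft(x, n1, n2):
    """Length n1*n2 DFT built from small independent sub-transforms."""
    n = n1 * n2
    assert len(x) == n
    # Step 1: n1 independent length-n2 transforms over strided sub-sequences.
    rows = [dft(x[r::n1]) for r in range(n1)]
    # Step 2: multiply by the twiddle factors exp(-2*pi*i * r * k2 / n).
    for r in range(n1):
        for k2 in range(n2):
            rows[r][k2] *= cmath.exp(-2j * cmath.pi * r * k2 / n)
    # Step 3: n2 independent length-n1 transforms across the rows,
    # written back in natural output order.
    out = [0j] * n
    for k2 in range(n2):
        col = dft([rows[r][k2] for r in range(n1)])
        for k1 in range(n1):
            out[n2 * k1 + k2] = col[k1]
    return out
```

For example, `blocked_fft(x, 4, 8)` on a 32-element complex list should agree with `dft(x)` up to floating-point rounding. On a GPU, each sub-transform in steps 1 and 3 would be a natural candidate for a thread block working out of shared memory, though how the paper actually maps work to the memory hierarchy is not described in the abstract.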