An implementation of parallel 2-d FFT using intel AVX instructions on multi-core processors

  • Authors:
  • Daisuke Takahashi

  • Affiliations:
  • Faculty of Engineering, Information and Systems, University of Tsukuba, Tsukuba, Ibaraki, Japan

  • Venue:
  • ICA3PP'12 Proceedings of the 12th international conference on Algorithms and Architectures for Parallel Processing - Volume Part II
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we propose an implementation of a parallel two-dimensional fast Fourier transform (FFT) using Intel Advanced Vector Extensions (AVX) instructions on multi-core processors. The combination of vectorization and a block two-dimensional FFT algorithm is shown to effectively improve performance. We vectorized FFT kernels using the AVX instructions. Performance results of two-dimensional FFTs on multi-core processors are reported. We successfully achieved a performance of over 61 GFlops on an Intel Xeon E5-2670 (2.6 GHz, two CPUs, 16 cores) and over 24 GFlops on an Intel Core i7-3930K (3.2 GHz, one CPU, six cores) for a 212×212-point FFT.