Automatic SIMD vectorization of chains of recurrences

  • Authors: Yixin Shou; Robert A. van Engelen
  • Affiliations: Florida State University, Tallahassee, FL, USA (both authors)
  • Venue: Proceedings of the 22nd Annual International Conference on Supercomputing
  • Year: 2008

Abstract

Many computational tasks require repeated evaluation of functions over structured grids, such as plotting in a coordinate system, rendering parametric objects in 2D and 3D, numerical grid generation, and signal processing. In this paper, we present a method and toolset to speed up closed-form function evaluations over grids by vectorizing Chains of Recurrences (CR). The CR form of a closed-form function requires fewer operations per grid point than direct evaluation. However, in the present CR formalism, CR forms are inherently non-vectorizable because each grid point's value depends on values carried over from the previous point. To address this limitation, we developed a new decoupling method for the CR algebra that translates math functions into Vector Chains of Recurrences (VCR) forms. The VCR coefficients are packed into short vector registers for efficient execution. We compare the performance of benchmark functions evaluated in single- and double-precision VCR forms against the Intel compiler's auto-vectorized code and the high-performance Short Vector Math Library (SVML). The results show a significant performance gain of our VCR method over SVML and scalar CRs, ranging from a twofold speedup to an order of magnitude. We also developed an auto-tuning tool for VCR to optimize performance and accuracy.
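
To make the idea concrete, the following is a minimal sketch and not the authors' VCR implementation. It assumes a polynomial evaluated at integer grid indices, and all function names (`cr_coefficients`, `cr_step`, `eval_scalar_cr`, `eval_decoupled_cr`) are hypothetical. The scalar version advances one recurrence point by point, so every value depends on the previous one. The decoupled version runs `width` independent recurrences with stride `width`, one per lane; plain Python lists stand in for the short vector registers.

```python
def cr_coefficients(f, start, stride, order):
    """Forward differences of f sampled at start, start+stride, ...

    For a polynomial of degree `order`, these order+1 numbers are the
    coefficients of a purely additive chain of recurrences.
    """
    samples = [f(start + k * stride) for k in range(order + 1)]
    coeffs = []
    while samples:
        coeffs.append(samples[0])
        samples = [b - a for a, b in zip(samples, samples[1:])]
    return coeffs


def cr_step(phi):
    """Advance one grid step in place: phi[k] += phi[k+1], additions only."""
    for k in range(len(phi) - 1):
        phi[k] += phi[k + 1]


def eval_scalar_cr(f, n, order):
    """Scalar CR: each output depends on the state left by the previous one."""
    phi = cr_coefficients(f, 0, 1, order)
    out = []
    for _ in range(n):
        out.append(phi[0])
        cr_step(phi)
    return out


def eval_decoupled_cr(f, n, order, width=4):
    """Decoupled sketch: lane j evaluates points j, j+width, j+2*width, ...

    The lanes share no state, so one step of all lanes could map to a
    single SIMD addition per coefficient (assumption: illustration only).
    """
    lanes = [cr_coefficients(f, j, width, order) for j in range(width)]
    out = []
    for _ in range(0, n, width):
        out.extend(lane[0] for lane in lanes)
        for lane in lanes:
            cr_step(lane)
    return out[:n]


def poly(i):
    # Example quadratic chosen for illustration.
    return 3 * i * i - 2 * i + 5


direct = [poly(i) for i in range(10)]
scalar = eval_scalar_cr(poly, 10, order=2)
decoupled = eval_decoupled_cr(poly, 10, order=2, width=4)
```

In this sketch each grid point of the quadratic costs two additions after initialization, versus several multiplications and additions for direct evaluation. The price of decoupling is a larger initialization (one coefficient set per lane). Integer arithmetic keeps the results exact here; with floating-point coefficients, rounding error accumulates along the recurrence, which is presumably why accuracy is one of the targets of the auto-tuning mentioned above.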