Auto-vectorization of interleaved data for SIMD

  • Authors:
  • Dorit Nuzman;Ira Rosen;Ayal Zaks

  • Affiliations:
  • IBM Haifa Labs, Haifa, Israel;IBM Haifa Labs, Haifa, Israel;IBM Haifa Labs, Haifa, Israel

  • Venue:
  • Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Most implementations of the Single Instruction Multiple Data (SIMD) model available today require that data elements be packed in vector registers. Operations on disjoint vector elements are not supported directly and require explicit data reorganization manipulations. Computations on non-contiguous and especially interleaved data appear in important applications, which can greatly benefit from SIMD instructions once the data is reorganized properly. Vectorizing such computations efficiently is therefore an ambitious challenge for both programmers and vectorizing compilers. We demonstrate an automatic compilation scheme that supports effective vectorization in the presence of interleaved data with constant strides that are powers of 2, facilitating data reorganization. We demonstrate how our vectorization scheme applies to dominant SIMD architectures, and present experimental results on a wide range of key kernels, showing speedups in execution time up to 3.7 for interleaving levels (stride) as high as 8.