An Evaluation of Vectorizing Compilers

Authors:
Saeed Maleki; Yaoqing Gao; Maria J. Garzarán; Tommy Wong; David A. Padua
Venue:
PACT '11 Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques
Year:
2011

Cited by 12

Extending OpenMP* with vector constructs for modern multicore SIMD architectures
IWOMP'12 Proceedings of the 8th international conference on OpenMP in a Heterogeneous World

Socio-PLT: principles for programming language adoption
Proceedings of the ACM international symposium on New ideas, new paradigms, and reflections on programming and software

From relational verification to SIMD loop synthesis
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming

Breaking SIMD shackles with an exposed flexible microarchitecture and the access execute PDG
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques

Automatic vectorization of tree traversals
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques

Generating data transfers for distributed GPU parallel programs
Journal of Parallel and Distributed Computing

On the advantage of time-varying diversity of workload on functionally asymmetric multi-core
Proceedings of International Workshop on Adaptive Self-tuning Computing Systems

Easy, fast, and energy-efficient object detection on heterogeneous on-chip architectures
ACM Transactions on Architecture and Code Optimization (TACO)

Simple, portable and fast SIMD intrinsic programming: generic simd library
Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing

Exploring the vectorization of python constructs using pythran and boost SIMD
Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing

OpenCL framework for ARM processors with NEON support
Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing

Vector seeker: a tool for finding vector potential
Proceedings of the 2014 Workshop on Programming models for SIMD/Vector processing

Abstract

Most of today's processors include vector units that have been designed to speed up single-threaded programs. Although vector instructions can deliver high performance, writing vector code in assembly language or using intrinsics in high-level languages is a time-consuming and error-prone task. The alternative is to automate the process of vectorization by using vectorizing compilers. This paper evaluates how well compilers vectorize a synthetic benchmark consisting of 151 loops, two applications from Petascale Application Collaboration Teams (PACT), and eight applications from Media Bench II. We evaluated three compilers: GCC (version 4.7.0), ICC (version 12.0), and XLC (version 11.01). Our results show that, despite all the work done on vectorization over the last 40 years, the compilers we evaluated vectorize only 45-71% of the loops in the synthetic benchmark and only a few loops from the real applications.
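
As a rough illustration of what a vectorizing compiler has to decide, the C sketch below contrasts two loops. Neither is taken from the paper's benchmark; the function names `add_arrays` and `running_sum` are hypothetical, and whether a particular compiler vectorizes the first loop depends on its version and flags.

```c
#include <stddef.h>

/* Independent iterations: a[i] depends only on b[i] and c[i].
   `restrict` promises that the arrays do not overlap, which removes
   the aliasing doubt that can stop a compiler from vectorizing. */
void add_arrays(float *restrict a, const float *restrict b,
                const float *restrict c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Loop-carried dependence: a[i] needs the a[i - 1] written by the
   previous iteration, so the iterations cannot simply be executed
   side by side in SIMD lanes. */
void running_sum(float *a, const float *b, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + b[i];
}
```

The first loop is the kind of candidate an auto-vectorizer is expected to handle without hand-written intrinsics; the second would need a transformation or a different algorithm before it could use vector instructions.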