Extending a C-like language for portable SIMD programming

Authors:
Roland Leißa;Sebastian Hack;Ingo Wald
Affiliations:
Saarland University, Saarbrücken, Germany;Saarland University, Saarbrücken, Germany;Intel Corporation, Saarbrücken, Germany
Venue:
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
Year:
2012

Abstract

SIMD instructions have been common in CPUs for years. Using these instructions effectively requires not only vectorization of code, but also modifications to the data layout. However, automatic vectorization techniques are often not powerful enough and suffer from a restricted scope of applicability; hence, programmers often vectorize their programs manually by using intrinsics: compiler-known functions that directly expand to machine instructions. Intrinsics significantly decrease programmer productivity by enforcing a very error-prone and hard-to-read assembly-like programming style. Furthermore, intrinsics are not portable because they are tied to a specific instruction set. In this paper, we show how a C-like language can be extended to allow for portable and efficient SIMD programming. Our extension puts the programmer in total control of where and how control-flow vectorization is triggered. We present a type system and a formal semantics of our extension and prove the soundness of the type system. Using our prototype implementation IVL, which targets Intel's MIC architecture and the SSE instruction set, we show that the generated code is roughly on par with handwritten intrinsic code.
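To make the two pain points named in the abstract concrete, the following is a minimal C sketch of a data layout change (array of structures to structure of arrays) followed by a hand-written SSE intrinsic kernel. It does not use the paper's language extension or IVL syntax; the names `PointAoS`, `PointsSoA`, `aos_to_soa`, and `length_squared` are hypothetical and chosen only for illustration. The scalar fallback is an assumption added so the sketch also compiles on targets without SSE.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics; only available on x86 targets */
#endif

enum { N = 8 };  /* number of points; assumed to be a multiple of 4 */

/* Array-of-structures layout: x, y, z of one point are adjacent in memory. */
typedef struct { float x, y, z; } PointAoS;

/* Structure-of-arrays layout: all x values are contiguous, so four
   consecutive points fill one 128-bit register per component. */
typedef struct { float x[N], y[N], z[N]; } PointsSoA;

/* The data layout modification: copy AoS data into SoA form. */
static void aos_to_soa(const PointAoS *in, PointsSoA *out) {
    for (size_t i = 0; i < N; ++i) {
        out->x[i] = in[i].x;
        out->y[i] = in[i].y;
        out->z[i] = in[i].z;
    }
}

/* Squared length x*x + y*y + z*z of every point. */
static void length_squared(const PointsSoA *p, float *out) {
    assert(N % 4 == 0);  /* the intrinsic path processes 4 floats per step */
#if defined(__SSE__)
    /* Intrinsic version: tied to SSE and to a vector width of 4. */
    for (size_t i = 0; i < N; i += 4) {
        __m128 x = _mm_loadu_ps(&p->x[i]);
        __m128 y = _mm_loadu_ps(&p->y[i]);
        __m128 z = _mm_loadu_ps(&p->z[i]);
        __m128 r = _mm_add_ps(_mm_add_ps(_mm_mul_ps(x, x), _mm_mul_ps(y, y)),
                              _mm_mul_ps(z, z));
        _mm_storeu_ps(&out[i], r);
    }
#else
    /* Scalar fallback for targets without SSE (an assumption of this sketch). */
    for (size_t i = 0; i < N; ++i)
        out[i] = p->x[i] * p->x[i] + p->y[i] * p->y[i] + p->z[i] * p->z[i];
#endif
}
```

Even in this small kernel, the intrinsic path hard-codes both the instruction set and the vector width, which is the portability problem the abstract describes; retargeting it to a wider SIMD unit would mean rewriting the loop body.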