Algorithmic species: A classification of affine loop nests for parallel programming

Authors:
Cedric Nugteren;Pieter Custers;Henk Corporaal
Affiliations:
Eindhoven University of Technology, The Netherlands;Eindhoven University of Technology, The Netherlands;Eindhoven University of Technology, The Netherlands
Venue:
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Year:
2013

Abstract

Code generation and programming have become ever more challenging over the last decade due to the shift towards parallel processing. Emerging processor architectures such as multi-cores and GPUs increasingly exploit parallelism, requiring programmers and compilers to deal with aspects such as threading, concurrency, synchronization, and complex memory partitioning. We argue that programmers and compilers can greatly benefit from a structured classification of program code. Such a classification can help programmers find opportunities for parallelization, reason about their code, and interact with other programmers. Similarly, parallelizing compilers and source-to-source compilers can make threading and optimization decisions based on the same classification. In this work, we introduce algorithmic species, a classification of affine loop nests based on the polyhedral model and targeted at both automatic and manual use. Individual classes capture information such as the structure of parallelism and the data reuse. To make the classification applicable for manual use, a basic vocabulary forms the basis for a set of intuitive classes. To demonstrate the use of algorithmic species, we identify 115 classes in a benchmark set. Additionally, we demonstrate the suitability of algorithmic species for automated uses by presenting a tool that automatically extracts species from program code, a species-based source-to-source compiler, and a species-based performance prediction model.