KPN2GPU: an approach for discovery and exploitation of fine-grain data parallelism in process networks

Authors:
Ana Balevic;Bart Kienhuis
Affiliations:
University of Leiden, Leiden, The Netherlands;University of Leiden, Leiden, The Netherlands
Venue:
ACM SIGARCH Computer Architecture News
Year:
2011

Abstract

With advances in manycore and accelerator architectures, the high-performance and embedded spaces are rapidly converging. Emerging architectures feature different forms of parallelism. Polyhedral Process Networks (PPNs) are a proven model of choice for the automated generation of pipeline- and task-parallel programs from sequential source code; however, they do not address data parallelism. In this paper, we present a systematic approach for identifying and extracting fine-grain data parallelism from the PPN specification. The approach is implemented in a tool called kpn2gpu, which produces fine-grain data-parallel CUDA kernels for graphics processing units (GPUs). First experiments indicate that the generated applications have the potential to exploit the different forms of parallelism provided by the architecture, and that the kernels feature a highly regular structure that allows subsequent optimizations.