ACM SIGARCH Computer Architecture News
Emerging architectures in the embedded space are expected to combine a diverse mix of multicores, vector units, GPU cores, and special-function accelerators. To facilitate mapping onto such diverse architectures, different models of computation have been considered. Polyhedral Process Networks (PPNs) have been used extensively for the automatic generation of task- and pipeline-parallel programs for embedded architectures. However, single program, multiple data (SPMD) data parallelism has not been addressed in the PPN model. In this paper, we propose a Data Parallel View (DPV) on PPNs, which introduces the abstractions necessary for capturing and exploiting data parallelism on top of the PPN model. As a proof of concept, we demonstrate how a PPN can be mapped onto a modern GPU using the DPV. By complementing the native PPN support for task and pipeline parallelism with the DPV support for data parallelism, we expect to make the best use of the different architectural components and types of parallelism available on heterogeneous architectures.
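To make the distinction between pipeline parallelism and SPMD data parallelism concrete, the following is a minimal Python sketch. It is not the DPV construction from the paper. All names (`producer`, `run_pipeline`, `spmd_worker`, `run_data_parallel`) and the toy computation `i * i + 1` are hypothetical, and the sketch assumes that the iterations of the domain are independent of one another. The first half models two processes connected by a FIFO channel, as in a process network; the second half runs the same program on every worker, each over its own slice of the iteration domain.

```python
from collections import deque


def producer(n):
    """Hypothetical process P1: one token per point of the domain {i | 0 <= i < n}."""
    for i in range(n):
        yield i * i


def run_pipeline(n):
    """Pipeline view: P1 writes tokens to a FIFO, P2 reads them in order."""
    fifo = deque()
    for token in producer(n):
        fifo.append(token)           # P1 writes to the channel
    out = []
    while fifo:
        out.append(fifo.popleft() + 1)  # P2 fires once per token
    return out


def spmd_worker(worker_id, num_workers, n):
    """SPMD view: every worker runs this same code on a strided slice
    of the iteration domain (assumes iterations are independent)."""
    return {i: i * i + 1 for i in range(worker_id, n, num_workers)}


def run_data_parallel(n, num_workers):
    """Sequential stand-in for workers that would run concurrently,
    for example as GPU threads."""
    results = {}
    for w in range(num_workers):
        results.update(spmd_worker(w, num_workers, n))
    return [results[i] for i in range(n)]
```

Both functions compute the same values; they differ only in how the work is organized. In the pipeline view the parallelism comes from distinct processes overlapping in time, while in the SPMD view it comes from many copies of one process working on disjoint parts of the same domain.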