Process Networks (PNs) are a parallel model of computation (MoC) well suited to specifying embedded streaming applications in a parallel form that facilitates efficient mapping onto embedded parallel execution platforms. Unfortunately, specifying an application directly in a parallel MoC is difficult and highly error-prone. To overcome these difficulties, we have developed the pn compiler, which derives parallel specifications in the form of Polyhedral Process Networks (PPNs) from sequential static affine nested loop programs (SANLPs). However, many applications, for example multimedia applications (MPEG coders/decoders, smart cameras, etc.), have adaptive and dynamic behavior that cannot be expressed as SANLPs. To handle such dynamic multimedia applications, in this article we address the question of whether some of the restrictions of SANLPs can be relaxed while keeping the ability to perform compile-time analysis and to derive PPNs. Achieving this would significantly extend the range of applications that can be parallelized automatically. The main contribution of this article is a first approach for the automated translation of affine nested loop programs with dynamic loop bounds into input-output equivalent Polyhedral Process Networks. In addition, we present a method for analyzing the execution overhead introduced in the PPNs derived from programs with dynamic loop bounds. We have evaluated the translation approach by deriving a PPN parallel specification from a real-life application called Low Speed Obstacle Detection (LSOD), used in the smart cameras domain. The results obtained by executing the derived PPN indicate that the approach presented in this article facilitates efficient parallel implementations of sequential nested loop programs with dynamic loop bounds.
That is, our approach exposes the parallelism available in such applications, which allows multiple cores to be used efficiently.
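To make the terms "dynamic loop bound" and "input-output equivalent" concrete, the following is a minimal illustrative sketch. It is not the output of the pn compiler and not the translation method of the article. All names (`sequential`, `producer`, `consumer`, `as_process_network`), the bound `data[i] % 4`, and the use of a control token to pass the bound are assumptions made for this example only. It shows a loop nest whose inner bound is known only at run time, and a hand-written two-process version connected by a bounded FIFO channel that produces the same output.

```python
import queue
import threading


def sequential(data):
    """Reference program: the inner loop bound n depends on the input data."""
    out = []
    for i in range(len(data)):
        n = data[i] % 4            # dynamic bound, unknown at compile time
        for j in range(n):
            out.append(data[i] * 10 + j)
    return out


def producer(data, ch):
    """First process: evaluates the bound and streams tokens into the FIFO."""
    for i in range(len(data)):
        n = data[i] % 4
        ch.put(("bound", n))       # control token: tells the consumer how many values follow
        for j in range(n):
            ch.put(("value", (data[i], j)))
    ch.put(("stop", None))         # end-of-stream marker (illustrative convention)


def consumer(ch, out):
    """Second process: blocking reads, as in a Kahn-style process network."""
    while True:
        tag, payload = ch.get()
        if tag == "stop":
            break
        for _ in range(payload):   # payload is the dynamic bound n
            _, (x, j) = ch.get()
            out.append(x * 10 + j)


def as_process_network(data):
    """Run producer and consumer concurrently over one bounded FIFO channel."""
    ch = queue.Queue(maxsize=4)    # bounded buffer: writes block when it is full
    out = []
    workers = [
        threading.Thread(target=producer, args=(data, ch)),
        threading.Thread(target=consumer, args=(ch, out)),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return out


if __name__ == "__main__":
    sample = [3, 5, 8, 2]
    print(sequential(sample) == as_process_network(sample))
```

In this sketch the extra control token is one possible source of the kind of execution overhead the abstract mentions: the consumer cannot know its iteration count statically, so the bound has to be communicated at run time.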