Parallel programming patterns for multi-processor SoC: Application to video processing

Authors:
Pierre G. Paulin; Ali Erdem Özcan; Vincent Gagné; Bruno Lavigueur; Olivier Benny
Affiliations:
STMicroelectronics Inc., Ottawa, Canada (all authors)
Venue:
ACM Transactions on Embedded Computing Systems (TECS) - Special section on ESTIMedia'12, LCTES'11, rigorous embedded systems design, and multiprocessor system-on-chip for cyber-physical systems
Year:
2013

Abstract

Efficient, scalable, and productive parallel programming is a major challenge in exploiting future multi-processor SoC platforms. This article presents the MultiFlex programming environment, which was developed to address this challenge. It targets Platform 2012, a scalable multi-processor fabric. The MultiFlex environment supports high-level simulation and iterative platform mapping, and includes tools for programming-model-aware debug, trace, visualization, and analysis. This article focuses on the two classes of programming abstractions supported in MultiFlex. The first is a set of Parallel Programming Patterns (PPP), which offer a rich set of programming abstractions for implementing efficient data- and task-level parallel applications. The second is a Reactive Task Management (RTM) abstraction, which offers a lightweight C-based API to support dynamic dispatching of small-grain tasks on tightly coupled parallel processing resources. The use of the MultiFlex native programming model is illustrated through the capture and mapping of two representative video applications. The first is a high-quality rescaling (HQR) application on a multi-processor platform. We present the details of the optimization process required to map the HQR application, whose reference code requires 350 GIPS (giga instructions per second), onto a 16-processor cluster. Our results show that the parallel implementation using the PPP model offers almost linear acceleration with respect to the number of processing elements. The second application is a high-definition VC-1 decoder. For this application, we illustrate two different parallel programming model variants, one using PPPs and the other based on RTM. These two versions are mapped onto two variants of a homogeneous version of the Platform 2012 multi-core fabric.