Exploiting just-enough parallelism when mapping streaming applications in hard real-time systems

Authors:
Jiali Teddy Zhai; Mohamed A. Bamakhrama; Todor Stefanov
Affiliations:
Leiden University, Leiden, The Netherlands (all authors)
Venue:
Proceedings of the 50th Annual Design Automation Conference
Year:
2013

Abstract

Embedded streaming applications specified using parallel Models of Computation (MoCs) often contain ample parallelism that can be exploited on Multi-Processor System-on-Chip (MPSoC) platforms. It has been shown that the various forms of parallelism in an application should be explored to achieve maximum system performance. However, revealing more parallelism than needed overloads the underlying MPSoC platform. At the same time, the revealed parallelism should be sufficient to fully utilize the MPSoC platform. Therefore, the amount of revealed and exploited parallelism has to be just enough with respect to the platform constraints. In this paper, we study the problem of exploiting just-enough parallelism by application task unfolding when mapping streaming applications modeled using the Synchronous Data Flow (SDF) MoC onto MPSoC platforms in hard real-time systems. We show that the problem of simultaneously unfolding and allocating tasks under hard real-time scheduling has a bounded solution space, and we derive its upper bounds. Subsequently, we devise an efficient algorithm that solves the problem such that the obtained solution meets a pre-specified quality. Experiments on a set of real-life streaming applications demonstrate that our algorithm produces, within a reasonable amount of time, a system specification with a large performance gain. Finally, we show that our proposed algorithm is on average 100 times faster than a state-of-the-art meta-heuristic, the NSGA-II genetic algorithm, while achieving the same solution quality.
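To make the idea of "just-enough" unfolding concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes each task is a periodic task with a worst-case execution time and a required period, that unfolding a task into k replicas lets each replica run with period k times the original, and that processors are scheduled by partitioned EDF with implicit deadlines (a processor is schedulable when its total utilization is at most 1). The function names, the first-fit allocation, and the example task set are all hypothetical.

```python
from math import ceil


def min_unfolding_factor(wcet: float, period: float) -> int:
    """Smallest replica count k with per-replica utilization wcet / (k * period) <= 1.

    Assumption of this sketch: replica i handles every k-th firing, so its
    period stretches from `period` to `k * period`.
    """
    return max(1, ceil(wcet / period))


def first_fit_edf(utilizations, num_processors):
    """First-fit (decreasing) partitioning under the EDF utilization bound.

    Returns the per-processor loads, or None if some task does not fit.
    """
    loads = [0.0] * num_processors
    for u in sorted(utilizations, reverse=True):
        for i in range(num_processors):
            if loads[i] + u <= 1.0 + 1e-9:  # small tolerance for float error
                loads[i] += u
                break
        else:
            return None  # no processor can host this replica
    return loads


def unfold_and_allocate(tasks, num_processors):
    """tasks maps a task name to (wcet, required_period).

    Unfolds each task by the minimum factor that makes it schedulable at all,
    then tries to allocate all replicas onto the given processors.
    """
    factors = {name: min_unfolding_factor(c, t) for name, (c, t) in tasks.items()}
    utils = []
    for name, (c, t) in tasks.items():
        k = factors[name]
        utils.extend([c / (k * t)] * k)  # k replicas, each with period k * t
    return factors, first_fit_edf(utils, num_processors)


# Hypothetical task set: the "filter" task is too slow for its period
# (utilization 2.5), so it must be unfolded before it can be scheduled.
tasks = {"src": (2, 10), "filter": (25, 10), "sink": (4, 10)}
factors, loads = unfold_and_allocate(tasks, num_processors=4)
print(factors)
print(loads)
```

In this sketch, unfolding less than the computed factor leaves a replica with utilization above 1, while unfolding more only adds replicas to allocate without lowering the total utilization. The paper's method additionally bounds the search space and trades off solution quality, which this example does not attempt to reproduce.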