Automatic parallelization of embedded software using hierarchical task graphs and integer linear programming

Authors:
Daniel Cordes;Peter Marwedel;Arindam Mallik
Affiliations:
Informatik Centrum Dortmund e.V., Dortmund, Germany;Informatik Centrum Dortmund e.V., Dortmund, Germany;Imec Belgium, Leuven, Belgium
Venue:
CODES/ISSS '10 Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Year:
2010

Abstract

Recent years have shown that the advantages of multiprocessor System-on-Chip (MPSoC) architectures can no longer be ignored in the embedded systems domain. Using multiple cores in a single system makes it possible to balance the competing demands of energy consumption, heat dissipation, and computational power. These benefits do not come for free, however: new challenges arise when existing applications have to be ported to multiprocessor platforms. One of the most demanding tasks is extracting efficient parallelism from existing sequential applications. Many parallelization tools have been developed for this purpose, but most of them extract as much parallelism as possible, which is generally not the best choice for embedded systems with their limited hardware and software support. In contrast to previous approaches, we present a new automatic parallelization tool tailored to the particular requirements of resource-constrained embedded systems. The paper presents an algorithm that automatically steers the granularity of the generated tasks with respect to architectural requirements and the overall reduction in execution time. To this end, we exploit hierarchical task graphs to simplify a new integer linear programming based approach that splits sequential programs efficiently. Results on real-life benchmarks show that the presented approach speeds up sequential applications by a factor of up to 3.7 on a four-core MPSoC architecture.
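
The granularity trade-off described in the abstract can be illustrated with a small sketch. It is not the paper's integer linear programming formulation: it replaces the ILP with an exhaustive search, handles only one level of a task graph whose nodes are assumed to be independent, and uses a hypothetical cost model in which every extra task pays a fixed creation overhead. The names `best_partition`, `speedup`, and `spawn_overhead` are invented for this illustration.

```python
from itertools import product


def best_partition(costs, max_tasks, spawn_overhead):
    """Assign each node to one of max_tasks tasks, minimizing the makespan.

    Assumptions (not taken from the paper): nodes are independent, task 0 is
    the already-running main task, and every other non-empty task pays
    spawn_overhead once. Returns (makespan, assignment).
    """
    best = None
    # Brute force stands in for the ILP solver; fine for a handful of nodes.
    for assignment in product(range(max_tasks), repeat=len(costs)):
        load = [0.0] * max_tasks
        for node, task in enumerate(assignment):
            load[task] += costs[node]
        # Charge the creation overhead to each extra task that is actually used.
        for task in range(1, max_tasks):
            if load[task] > 0:
                load[task] += spawn_overhead
        makespan = max(load)
        if best is None or makespan < best[0]:
            best = (makespan, assignment)
    return best


def speedup(costs, makespan):
    """Sequential execution time divided by parallel execution time."""
    return sum(costs) / makespan


if __name__ == "__main__":
    costs = [40, 30, 20, 10]  # hypothetical execution costs of four nodes
    for overhead in (5, 200):
        makespan, assignment = best_partition(costs, 4, overhead)
        print(overhead, makespan, assignment, speedup(costs, makespan))
```

With a cheap overhead the search spreads the nodes over several tasks, while with an expensive overhead it keeps everything in the main task. That is the sense in which extracting as much parallelism as possible is not always the best choice. For scale, the reported best-case speedup of 3.7 on four cores corresponds to a parallel efficiency of roughly 92%.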