ARTM: a lightweight fork-join framework for many-core embedded systems

Authors:
Maroun Ojail; Raphael David; Yves Lhuillier; Alexandre Guerre
Affiliations:
CEA, LIST, Embedded Computing Laboratory, Gif-sur-Yvette, France (all authors)
Venue:
Proceedings of the Conference on Design, Automation and Test in Europe
Year:
2013

Abstract

Embedded architectures are moving to multi-core and many-core designs in order to sustain ever-growing computing requirements within complexity and power budgets. Programming many-core architectures requires not only parallel programming skills but also efficient exploitation of fine-grain parallelism at both the architecture and runtime levels. As task granularity is reduced, scheduler reactivity becomes increasingly important in order to keep scheduling overhead to a minimum. This paper presents ARTM, a lightweight fork-join framework for scheduling fine-grain parallel tasks on embedded many-core systems. The asynchronous nature of the fork-join model used in this framework makes it possible to dramatically decrease its scheduling overhead. Experiments conducted in this paper show that the overhead induced by this framework is 33 cycles per scheduled task. We also show that the ARTM framework can obtain near-ideal speedup for data-parallel applications and that ARTM achieves better results than other state-of-the-art parallelization techniques.
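
To make the link between task granularity and scheduling overhead concrete, the following minimal sketch computes the fraction of cycles spent on useful work when each task pays a fixed scheduling cost. Only the 33-cycle figure comes from the abstract. The additive cost model, the function name `scheduling_efficiency`, and the task sizes are assumptions made for illustration and are not taken from the paper.

```python
# Illustrative model only: assumes each task pays a fixed scheduling cost
# on top of its useful work. This additive model is an assumption, not the
# paper's methodology. Only the 33-cycle figure comes from the abstract.
ARTM_OVERHEAD_CYCLES = 33  # per scheduled task, as reported in the abstract


def scheduling_efficiency(task_cycles, overhead_cycles=ARTM_OVERHEAD_CYCLES):
    """Return the fraction of cycles spent on useful work per task."""
    if task_cycles <= 0:
        raise ValueError("task_cycles must be positive")
    return task_cycles / (task_cycles + overhead_cycles)


if __name__ == "__main__":
    # Hypothetical task sizes, chosen only to show the trend.
    for task_cycles in (33, 330, 3300, 33000):
        eff = scheduling_efficiency(task_cycles)
        print(f"{task_cycles:>6} cycles/task -> {eff:.1%} useful work")
```

Under this simplified model, a task as short as 33 cycles would spend half of its time in the scheduler, whereas tasks of a few thousand cycles would lose about 1% or less. This is one way to read the abstract's point that per-task overhead matters most at fine granularity.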