Lightweight barrier-based parallelization support for non-cache-coherent MPSoC platforms

Authors:
Andrea Marongiu;Luca Benini;Mahmut Kandemir
Affiliations:
University of Bologna;University of Bologna;Penn State University
Venue:
CASES '07 Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems
Year:
2007

Abstract

Many MPSoC applications are loop-intensive and amenable to automatic parallelization with suitable compiler support. A key component of any compiler-parallelized code is the barrier, which performs global synchronization across the parallel processors. This scenario calls for a lightweight synchronization infrastructure. In this work we describe a lightweight barrier support library for a non-cache-coherent MPSoC architecture. The library is coupled with a parallelizing compiler front-end to form a complete automated flow that, starting from sequential code, produces parallelized binary code that can be executed directly on an MPSoC target (a multi-core, non-cache-coherent ARM7 platform). This tool flow has been characterized in terms of system performance and energy.
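
The abstract does not say how the library implements its barriers, so the following is only an illustrative sketch of one common lightweight scheme, a centralized sense-reversing barrier. It is not the authors' implementation. All names (`barrier_t`, `barrier_wait`, `run_demo`) are hypothetical, and C11 atomics with POSIX threads on a host machine stand in for the ARM7 cores and their shared memory.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_THREADS 16

/* Shared barrier state. On a non-cache-coherent target this would have to
 * live in uncached (or explicitly flushed) shared memory; that placement is
 * platform-specific and is NOT modeled here (assumption). */
typedef struct {
    atomic_int  remaining;  /* threads still to arrive in this episode */
    atomic_bool sense;      /* flips once per completed episode */
    int         nthreads;
} barrier_t;

static void barrier_init(barrier_t *b, int nthreads) {
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    b->nthreads = nthreads;
    atomic_init(&b->remaining, nthreads);
    atomic_init(&b->sense, false);
}

/* Each thread owns a local_sense flag, initially false. */
static void barrier_wait(barrier_t *b, bool *local_sense) {
    *local_sense = !*local_sense;
    if (atomic_fetch_sub(&b->remaining, 1) == 1) {
        /* Last arrival: reset the counter, then release the waiters. */
        atomic_store(&b->remaining, b->nthreads);
        atomic_store(&b->sense, *local_sense);
    } else {
        while (atomic_load(&b->sense) != *local_sense)
            sched_yield();  /* host demo only; a bare-metal core would spin */
    }
}

typedef struct {
    barrier_t  *b;
    int         id, n, rounds;
    int        *slots;
    atomic_int *errors;
} worker_arg_t;

/* Each round: publish a value, synchronize, check everyone's value. */
static void *worker(void *p) {
    worker_arg_t *a = p;
    bool local_sense = false;
    for (int r = 1; r <= a->rounds; r++) {
        a->slots[a->id] = r;
        barrier_wait(a->b, &local_sense);   /* all writes for round r done */
        for (int i = 0; i < a->n; i++)
            if (a->slots[i] != r)
                atomic_fetch_add(a->errors, 1);
        barrier_wait(a->b, &local_sense);   /* all reads for round r done */
    }
    return NULL;
}

/* Returns the number of stale reads observed; 0 means the barrier held. */
int run_demo(int nthreads, int rounds) {
    barrier_t b;
    int slots[MAX_THREADS] = {0};
    atomic_int errors;
    pthread_t tid[MAX_THREADS];
    worker_arg_t args[MAX_THREADS];

    barrier_init(&b, nthreads);
    atomic_init(&errors, 0);
    for (int i = 0; i < nthreads; i++) {
        args[i] = (worker_arg_t){ &b, i, nthreads, rounds, slots, &errors };
        pthread_create(&tid[i], NULL, worker, &args[i]);
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return atomic_load(&errors);
}
```

Calling `run_demo(4, 50)` runs four workers through fifty write/check rounds. The sense flag lets the same barrier object be reused in every round without reinitialization, which is one reason this scheme is often chosen when synchronization cost matters.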