Towards automatic translation of OpenMP to MPI

Authors:
Ayon Basumallik; Rudolf Eigenmann
Affiliations:
Purdue University, West Lafayette, IN; Purdue University, West Lafayette, IN
Venue:
Proceedings of the 19th annual international conference on Supercomputing
Year:
2005

Abstract

We present compiler techniques for translating OpenMP shared-memory parallel applications into MPI message-passing programs for execution on distributed memory systems. This translation aims to extend the ease of creating parallel applications with OpenMP to a wider variety of platforms, such as commodity cluster systems. We present key concepts and describe techniques to analyze and efficiently handle both regular and irregular accesses to shared data. We evaluate the performance achieved by our translation scheme on seven representative OpenMP applications, two from SPEC OMPM2001 and five from the NAS Parallel Benchmarks suite, on two different platforms. The average scalability (execution time relative to the serial version) achieved is within 12% of that achieved by corresponding hand-tuned MPI applications. We also compare our programs with versions deployed for a Software Distributed Shared Memory (SDSM) system and find that the direct translation to MPI achieves up to 30% higher scalability. A comparison with High Performance Fortran (HPF) versions of two NAS benchmarks indicates that our translated OpenMP versions achieve 12% to 89% better performance than the HPF versions.
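
To make the idea of handling regular accesses to shared data concrete, the sketch below simulates, in plain Python and without MPI, what a translator might need to derive for a simple parallel loop of the form `b[i] = f(a[i-1], a[i+1])`: which iterations each process runs, and which elements of `a` it must receive from other processes. This is only an illustration under stated assumptions, not the authors' algorithm. The block partitioning scheme and the names `block_range`, `owner`, and `halo_messages` are hypothetical.

```python
def block_range(n, nprocs, rank):
    """Half-open range [lo, hi) of iterations assigned to `rank`.

    Assumes a simple block partition of range(n); the first
    `n % nprocs` ranks get one extra iteration.
    """
    base, extra = divmod(n, nprocs)
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi


def owner(n, nprocs, index):
    """Rank whose block contains `index` (hypothetical helper)."""
    for rank in range(nprocs):
        lo, hi = block_range(n, nprocs, rank)
        if lo <= index < hi:
            return rank
    raise IndexError(index)


def halo_messages(n, nprocs, offsets):
    """Elements of a shared array that must be communicated.

    Models a loop `for i in range(n): b[i] = f(a[i + d] for d in offsets)`
    where each rank owns the slice of `a` matching its iterations.
    Returns {(src, dst): sorted indices of `a` that dst reads but src owns}.
    Out-of-bounds accesses are simply skipped in this sketch.
    """
    messages = {}
    for dst in range(nprocs):
        lo, hi = block_range(n, nprocs, dst)
        for i in range(lo, hi):
            for d in offsets:
                j = i + d
                if 0 <= j < n and not (lo <= j < hi):
                    src = owner(n, nprocs, j)
                    messages.setdefault((src, dst), set()).add(j)
    return {pair: sorted(idx) for pair, idx in messages.items()}


if __name__ == "__main__":
    # 10 iterations on 3 processes, reading the left and right neighbor.
    for (src, dst), idx in sorted(halo_messages(10, 3, (-1, 1)).items()):
        print(f"rank {src} -> rank {dst}: a{idx}")
```

In an actual MPI program each `(src, dst)` entry would correspond to a send on one rank and a matching receive on the other. Irregular accesses such as `a[idx[i]]` cannot be enumerated this way ahead of time, which is why the abstract treats them as a separate case.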