Unifying barrier and point-to-point synchronization in OpenMP with phasers

Authors:
Jun Shirako;Kamal Sharma;Vivek Sarkar
Affiliations:
Department of Computer Science, Rice University;Department of Computer Science, Rice University;Department of Computer Science, Rice University
Venue:
IWOMP'11 Proceedings of the 7th international conference on OpenMP in the Petascale era
Year:
2011

Abstract

OpenMP is a widely used standard for parallel programming on a broad range of SMP systems. In the OpenMP programming model, synchronization points are specified by implicit or explicit barrier operations. However, certain classes of computations, such as stencil algorithms, need to synchronize only among particular tasks/threads in order to support pipeline parallelism, which offers better synchronization efficiency and data locality than wavefront parallelism using all-to-all barriers. In this paper, we propose two new synchronization constructs in the OpenMP programming model, thread-level phasers and iteration-level phasers, to support various synchronization patterns such as point-to-point synchronization and sub-group barriers with neighbor threads. Experimental results on three platforms using numerical applications show performance improvements of phasers over OpenMP barriers of up to 1.74× on an 8-core Intel Nehalem system, up to 1.59× on a 16-core Core-2-Quad system, and up to 1.44× on a 32-core IBM Power7 system. It is reasonable to expect larger increases on future manycore processors.
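
The contrast the abstract draws between all-to-all barriers and neighbor-only synchronization can be illustrated with a minimal sketch. It is not the phaser construct proposed in the paper and it does not use OpenMP: it is a Python stand-in that uses one counting semaphore per pair of neighboring threads, and the recurrence, the function names (pipelined_grid, sequential_grid) and the one-thread-per-column decomposition are assumptions chosen only for illustration. Each thread waits for its left neighbor alone, so columns advance as a pipeline rather than stopping together at a global barrier.

```python
import threading


def sequential_grid(rows, cols):
    """Reference result: grid[t][i] = grid[t-1][i] + grid[t][i-1], borders fixed at 1."""
    grid = [[1] * cols for _ in range(rows)]
    for t in range(1, rows):
        for i in range(1, cols):
            grid[t][i] = grid[t - 1][i] + grid[t][i - 1]
    return grid


def pipelined_grid(rows, cols):
    """Same recurrence, one thread per column, synchronizing only with neighbors."""
    grid = [[1] * cols for _ in range(rows)]
    # ready[i] counts rows that column i-1 has finished and column i has not yet used.
    # (Hypothetical helper structure; it only mimics a point-to-point signal/wait.)
    ready = [threading.Semaphore(0) for _ in range(cols)]

    def worker(i):
        for t in range(1, rows):
            if i > 0:
                # Wait for the left neighbor's row t only, not for every thread.
                ready[i].acquire()
                grid[t][i] = grid[t - 1][i] + grid[t][i - 1]
            if i + 1 < cols:
                # Signal the right neighbor that row t of this column is final.
                ready[i + 1].release()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(cols)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return grid
```

Calling pipelined_grid(5, 4) should return the same table as sequential_grid(5, 4). Because every cell is written once and never overwritten, a one-directional signal from left to right is sufficient here; an in-place stencil that reuses storage would also need the reverse signal, which is the kind of pattern the abstract groups under sub-group barriers with neighbor threads.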