CLOMP: accurately characterizing OpenMP application overheads

Authors:
Greg Bronevetsky; John Gyllenhaal; Bronis R. De Supinski
Affiliations:
Computation Directorate, Lawrence Livermore National Laboratory, Livermore, CA; Computation Directorate, Lawrence Livermore National Laboratory, Livermore, CA; Computation Directorate, Lawrence Livermore National Laboratory, Livermore, CA
Venue:
IWOMP'08 Proceedings of the 4th international conference on OpenMP in a new era of parallelism
Year:
2008



Abstract

Despite its ease of use, OpenMP has failed to gain widespread use on large-scale systems, largely due to its failure to deliver sufficient performance. Our experience indicates that the cost of initiating OpenMP regions is simply too high for the desired OpenMP usage scenario of many applications. In this paper, we introduce CLOMP, a new benchmark to characterize this aspect of OpenMP implementations accurately. CLOMP complements the existing EPCC benchmark suite to provide simple, easy-to-understand measurements of OpenMP overheads in the context of application usage scenarios. Our results for several OpenMP implementations demonstrate that CLOMP identifies the amount of work required to compensate for the overheads observed with EPCC. Further, we show that CLOMP also captures limitations for OpenMP parallelization on NUMA systems.
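
The idea of an "amount of work required to compensate for the overheads" can be illustrated with a small break-even calculation. The sketch below is not CLOMP's method or code; it assumes an idealized model in which a parallel region costs a fixed startup overhead and then splits its iterations evenly across threads. The function names and parameters (`work_per_iter`, `region_overhead`, and so on) are hypothetical and exist only for this illustration.

```python
def serial_time(work_per_iter, iterations):
    """Time to run all iterations on one thread (assumed model, no overhead)."""
    return work_per_iter * iterations


def parallel_time(work_per_iter, iterations, threads, region_overhead):
    """Assumed model: fixed region overhead plus perfectly divided work."""
    return region_overhead + (work_per_iter * iterations) / threads


def break_even_work(iterations, threads, region_overhead):
    """Per-iteration work at which parallel and serial times are equal.

    Derived from: overhead + n*w/p = n*w  =>  w = overhead / (n * (1 - 1/p)).
    Below this value, the region overhead outweighs the parallel speedup.
    """
    if threads < 2:
        raise ValueError("a break-even point needs at least two threads")
    if iterations < 1:
        raise ValueError("iterations must be positive")
    return region_overhead / (iterations * (1.0 - 1.0 / threads))
```

Under this model, a region with few iterations or a large startup cost needs more work per iteration before parallelization pays off, which is the kind of trade-off the abstract describes. Real implementations also incur scheduling, synchronization, and NUMA effects that this sketch ignores.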