In this paper we provide quantitative information about the performance differences between the OpenMP and MPI versions of a large-scale application benchmark suite, SPECseis. We have gathered extensive performance data using hardware counters on a 4-processor Sun Enterprise system. To present this information we use a Speedup Component Model, which precisely shows the impact of various overheads on program speedup. We have found that, overall, the performance figures of the two program versions match closely. However, our analysis also shows interesting differences in individual program phases and in the overhead categories incurred. Our work gives initial answers to a largely unanswered research question: what are the sources of inefficiency of OpenMP programs, relative to other programming paradigms, in large, realistic applications? Our results indicate that the OpenMP and MPI models are essentially performance-equivalent on shared-memory architectures. However, we also found interesting differences in behavioral details, such as the number of instructions executed, and in the memory latencies and processor stalls incurred.
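The Speedup Component Model mentioned above attributes the gap between ideal and measured speedup to individual overhead categories derived from hardware-counter data. A minimal sketch of that idea follows; the category names, input values, and the attribution rule (overhead time scaled by processor count over parallel time, so the components and the measured speedup sum to the ideal speedup) are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of a speedup-component decomposition.
# Assumption: per-category overhead times on the parallel run have
# already been extracted from hardware counters, and together they
# account for all time in excess of the ideal (t_serial / n_procs).

def speedup_components(t_serial, t_parallel, overheads, n_procs):
    """Return (measured speedup, speedup lost per overhead category).

    t_serial   : wall-clock time of the serial run (seconds)
    t_parallel : wall-clock time of the parallel run (seconds)
    overheads  : dict mapping category name -> time spent in that
                 overhead during the parallel run (seconds)
    n_procs    : number of processors (ideal speedup)
    """
    speedup = t_serial / t_parallel
    # Attribute lost speedup so that speedup + sum(components) == n_procs
    # whenever the overheads account for all excess parallel time.
    components = {
        cat: n_procs * t_over / t_parallel
        for cat, t_over in overheads.items()
    }
    return speedup, components

# Illustrative numbers only (not measurements from the paper):
s, comp = speedup_components(
    t_serial=100.0, t_parallel=30.0, n_procs=4,
    overheads={"memory_stalls": 3.0,
               "synchronization": 1.5,
               "extra_instructions": 0.5})
```

With these inputs the measured speedup is 100/30 ≈ 3.33 on 4 processors, and the three components account for the remaining ≈ 0.67 of the ideal speedup, making the cost of each overhead category directly visible.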