Performance Evaluation of MPI, UPC and OpenMP on Multicore Architectures

Authors:
Damián A. Mallón; Guillermo L. Taboada; Carlos Teijeiro; Juan Touriño; Basilio B. Fraguela; Andrés Gómez; Ramón Doallo; J. Carlos Mouriño
Affiliations:
Galicia Supercomputing Center (CESGA), Santiago de Compostela, Spain; Computer Architecture Group, University of A Coruña, A Coruña, Spain; Computer Architecture Group, University of A Coruña, A Coruña, Spain; Computer Architecture Group, University of A Coruña, A Coruña, Spain; Computer Architecture Group, University of A Coruña, A Coruña, Spain; Galicia Supercomputing Center (CESGA), Santiago de Compostela, Spain; Computer Architecture Group, University of A Coruña, A Coruña, Spain; Galicia Supercomputing Center (CESGA), Santiago de Compostela, Spain
Venue:
Proceedings of the 16th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Year:
2009

Abstract

The current trend toward multicore architectures underscores the need for parallelism. While new languages and alternatives for supporting these systems more efficiently are being proposed, MPI also faces this new challenge. Up-to-date performance evaluations of the current options for programming multicore systems are therefore needed. This paper evaluates MPI performance against Unified Parallel C (UPC) and OpenMP on multicore architectures. The analysis of the results shows that MPI is generally the best choice on multicore systems with both shared and hybrid shared/distributed memory, as it takes the greatest advantage of data locality, the key factor for performance in these systems. UPC exploits the data layout in memory efficiently but suffers from remote shared memory accesses, whereas OpenMP usually lacks efficient data locality support and is restricted to shared memory systems, which limits its scalability.