Unified Parallel C (UPC) is an emerging parallel programming language based on a shared-memory (PGAS) paradigm, while MPI has been the dominant, widely ported parallel programming model for the past two decades. Real-life scientific applications represent a large investment by domain scientists, and many of them choose MPI because it is considered low-risk. It is therefore unlikely that entire applications will be rewritten in UPC (or another PGAS language) in the near future; it is more likely that parts of these applications will be converted to newer models incrementally. This requires the underlying system software to support both UPC and MPI simultaneously. Unfortunately, the current state of the art in UPC and MPI interoperability leaves much to be desired, both in performance and in ease of use. In this paper, we propose an "Integrated Native Communication Runtime" (INCR) for MPI and UPC communication on InfiniBand clusters. Our library is capable of supporting UPC and MPI communications simultaneously. It is based on the widely used MVAPICH (MPI over InfiniBand) Aptus runtime, which is known to scale to tens of thousands of cores. Our evaluation reveals that INCR delivers equal or better performance than the existing UPC runtime, GASNet, on InfiniBand verbs: with the UPC NAS benchmarks CG and MG (class B) at 128 processes, INCR outperforms the current GASNet implementation by 10% and 23%, respectively.
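To make the interoperability scenario concrete, a hybrid application might mix UPC shared-array accesses with MPI collectives in a single executable, which only works cleanly if both models are serviced by one communication runtime, as INCR provides. The sketch below is hypothetical and not taken from the paper; it assumes a UPC compiler (e.g. upcc) linked against an MPI-interoperable runtime, with MPI ranks and UPC threads coinciding one-to-one.

```c
/* Hypothetical hybrid UPC + MPI fragment (illustrative sketch only). */
#include <upc.h>
#include <mpi.h>
#include <stdio.h>

#define BLOCK 100
/* Blocked shared array: thread t owns elements [t*BLOCK, (t+1)*BLOCK) */
shared [BLOCK] double grid[BLOCK * THREADS];

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    /* UPC portion: each thread writes its own block of the
       globally shared array using PGAS one-sided semantics. */
    upc_forall(int i = 0; i < BLOCK * THREADS; i++; &grid[i])
        grid[i] = (double)MYTHREAD;
    upc_barrier;

    /* MPI portion: a collective over the same set of processes.
       The runtime must carry both models' traffic simultaneously. */
    double local = grid[MYTHREAD * BLOCK], sum = 0.0;
    MPI_Allreduce(&local, &sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (MYTHREAD == 0)
        printf("sum of thread ids = %g\n", sum);  /* 0+1+...+(THREADS-1) */

    MPI_Finalize();
    return 0;
}
```

With today's stacks, such a program typically drags in two independent runtimes side by side (GASNet for the UPC traffic, an MPI library for the collectives), each holding its own network resources; this duplication is the interoperability problem that a single MVAPICH-based runtime like INCR is designed to eliminate.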