Proceedings of the 2008 ACM/IEEE conference on Supercomputing
A Buffered-Mode MPI Implementation for the Cell BE Processor
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part I: ICCS 2007
Single Data Copying for MPI Communication Optimization on Shared Memory System
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part I: ICCS 2007
A Prototype Implementation of MPI for SMARTMAP
Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Efficient high performance collective communication for the cell blade
Proceedings of the 23rd international conference on Supercomputing
Process Arrival Pattern and Shared Memory Aware Alltoall on InfiniBand
Proceedings of the 16th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Modeling advanced collective communication algorithms on cell-based systems
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Exploiting Direct Access Shared Memory for MPI On Multi-Core Processors
International Journal of High Performance Computing Applications
Accelerating data movement on future chip multi-processors
Proceedings of the Second International Forum on Next-Generation Multicore/Manycore Technologies
Implementation and shared-memory evaluation of MPICH2 over the nemesis communication subsystem
EuroPVM/MPI'06 Proceedings of the 13th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
A synchronous mode MPI implementation on the Cell BE architecture
ISPA'07 Proceedings of the 5th international conference on Parallel and Distributed Processing and Applications
KNEM: A generic and scalable kernel-assisted intra-node MPI communication framework
Journal of Parallel and Distributed Computing
This paper focuses on the transfer of large data in SMP systems. Achieving good performance for intra-node communication is critical to building an efficient communication system, especially in the context of SMP clusters. We evaluate the performance of five transfer mechanisms: shared-memory buffers, message queues, the ptrace system call, kernel-module-based copy, and a high-speed network. We evaluate each mechanism on latency, bandwidth, its impact on application cache usage, and its suitability for supporting MPI two-sided and one-sided messages.