Proceedings of the 2008 ACM/IEEE conference on Supercomputing
A Buffered-Mode MPI Implementation for the Cell BE Processor
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part I: ICCS 2007
Single Data Copying for MPI Communication Optimization on Shared Memory System
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part I: ICCS 2007
A Prototype Implementation of MPI for SMARTMAP
Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Efficient high performance collective communication for the cell blade
Proceedings of the 23rd international conference on Supercomputing
Process Arrival Pattern and Shared Memory Aware Alltoall on InfiniBand
Proceedings of the 16th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Modeling advanced collective communication algorithms on cell-based systems
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Exploiting Direct Access Shared Memory for MPI On Multi-Core Processors
International Journal of High Performance Computing Applications
Accelerating data movement on future chip multi-processors
Proceedings of the Second International Forum on Next-Generation Multicore/Manycore Technologies
Implementation and shared-memory evaluation of MPICH2 over the nemesis communication subsystem
EuroPVM/MPI'06 Proceedings of the 13th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
A synchronous mode MPI implementation on the Cell BE architecture
ISPA'07 Proceedings of the 5th international conference on Parallel and Distributed Processing and Applications
KNEM: A generic and scalable kernel-assisted intra-node MPI communication framework
Journal of Parallel and Distributed Computing
This paper focuses on the transfer of large data in SMP systems. Achieving good performance for intra-node communication is critical to building an efficient communication system, especially in the context of SMP clusters. We evaluate the performance of five transfer mechanisms: shared-memory buffers, message queues, the ptrace system call, kernel-module-based copy, and a high-speed network. We evaluate each mechanism on latency, bandwidth, its impact on application cache usage, and its suitability for supporting MPI two-sided and one-sided messages.