We discuss issues related to the high-performance implementation of collective communication operations on distributed-memory computer architectures. Using a combination of known techniques (many of which were first proposed in the 1980s and early 1990s) along with careful exploitation of communication modes supported by MPI, we have developed implementations that outperform, in most situations, those currently provided by public-domain implementations of MPI such as MPICH. Performance results from a large Intel(R) Pentium(R) 4 processor cluster are included.
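The abstract does not say which known techniques are combined. As an illustration only, the sketch below simulates one well-known technique from that era, a binomial-tree broadcast, in which the set of processes holding the message doubles each round. It is a pure-Python model that makes no MPI calls. The function names `binomial_broadcast_schedule` and `simulate` are hypothetical, and nothing here should be read as the authors' implementation.

```python
def binomial_broadcast_schedule(p, root=0):
    """Return the rounds of a binomial-tree broadcast among p ranks.

    Each round is a list of (sender, receiver) rank pairs. Ranks are
    handled relative to the root so that any rank can be the root.
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    if not 0 <= root < p:
        raise ValueError("root must be a valid rank")
    rounds = []
    have = 1  # relative ranks 0 .. have-1 already hold the message
    while have < p:
        step = []
        for rel in range(have):
            peer = rel + have  # each holder sends to one new rank
            if peer < p:
                step.append(((rel + root) % p, (peer + root) % p))
        rounds.append(step)
        have *= 2  # the set of holders doubles every round
    return rounds


def simulate(p, root=0):
    """Replay the schedule and return (ranks holding the message, rounds)."""
    holders = {root}
    schedule = binomial_broadcast_schedule(p, root)
    for step in schedule:
        # A rank may only send a message it already holds.
        for src, _ in step:
            if src not in holders:
                raise RuntimeError("sender does not hold the message yet")
        holders.update(dst for _, dst in step)
    return holders, len(schedule)


if __name__ == "__main__":
    holders, rounds = simulate(8)
    print(sorted(holders), rounds)
```

Under this model, a broadcast among p processes completes in ceil(log2 p) rounds, which suits short messages where per-message startup cost dominates. For long messages, other techniques usually use bandwidth better.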