Fine-Grained Data Distribution Operations for Particle Codes. In: Proceedings of the 16th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface.
Scalable communication protocols for dynamic sparse data exchange. In: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.
Kanor: a declarative language for explicit communication. In: PADL '11, Proceedings of the 13th International Conference on Practical Aspects of Declarative Languages.
Cosmic microwave background map-making at the petascale and beyond. In: Proceedings of the International Conference on Supercomputing.
Kernel-based offload of collective operations: implementation, evaluation and lessons learned. In: Euro-Par '11, Proceedings of the 17th International Conference on Parallel Processing, Part II.
Optimization principles for collective neighborhood communications. In: SC '12, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis.
mpicroscope: towards an MPI benchmark tool for performance guideline verification. In: EuroMPI '12, Proceedings of the 19th European Conference on Recent Advances in the Message Passing Interface.
Bandwidth-optimal all-to-all exchanges in fat tree networks. In: Proceedings of the 27th ACM International Conference on Supercomputing.
We discuss issues in the design of sparse (nearest-neighbor) collective operations for communication and reduction in small neighborhoods for the Message Passing Interface (MPI). We propose three such operations, namely a sparse gather, a sparse all-to-all, and a sparse reduction, each in both regular and irregular (vector) variants. With two simple experiments we show a) that a collective handle for message scheduling and communication optimization is necessary for any such interface, b) that the optimization must take into account the possibly different amounts of communication between neighbors, and c) the improvements that are possible with schedules that possess global information compared to implementations that can rely only on local information. We discuss different forms the interface and optimization handles could take. The paper is inspired by current discussion in the MPI Forum.
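The advantage of globally informed schedules over purely local ones (point c) can be illustrated with a toy round-counting model. The neighborhood graph, the greedy edge-coloring schedule, and the rank-order local protocol below are illustrative assumptions of this sketch, not the paper's actual interfaces or experiments: each pairwise exchange occupies both partners for one round, a global scheduler colors the edges of the neighborhood graph so non-conflicting exchanges share a round, and a local scheduler has each process propose to its pending neighbors in rank order.

```python
from itertools import count

def greedy_edge_coloring(edges):
    """Globally informed schedule: greedily color the neighborhood
    graph's edges so that no two edges sharing a process get the
    same color; each color class is one communication round."""
    color = {}
    for u, v in edges:
        used = {c for (a, b), c in color.items() if u in (a, b) or v in (a, b)}
        color[(u, v)] = next(c for c in count() if c not in used)
    return max(color.values()) + 1  # number of rounds

def local_rank_order_rounds(edges):
    """Locally informed schedule: every process works through its
    neighbor list in increasing rank order; a pairwise exchange
    completes only when both partners propose each other in the
    same round."""
    pending = {}
    for u, v in edges:
        pending.setdefault(u, []).append(v)
        pending.setdefault(v, []).append(u)
    for lst in pending.values():
        lst.sort()
    rounds = 0
    while any(pending.values()):
        rounds += 1
        matched = [(u, lst[0]) for u, lst in pending.items()
                   if lst and pending[lst[0]] and pending[lst[0]][0] == u]
        for u, v in matched:
            if pending[u] and pending[u][0] == v:
                pending[u].pop(0)
    return rounds

# A 4-process chain 0-1-2-3: with global information the exchanges
# (0,1) and (2,3) share a round; rank-order local proposals serialize.
chain = [(0, 1), (1, 2), (2, 3)]
print(greedy_edge_coloring(chain))     # 2 rounds with global information
print(local_rank_order_rounds(chain))  # 3 rounds with local information
```

On sparser or more irregular neighborhoods the gap between the two schedules can grow with the neighborhood size, which is consistent with the abstract's argument for a collective handle that lets the library compute a schedule from global information.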