Two algorithms for barrier synchronization
International Journal of Parallel Programming
Optimization of MPI collective communication on BlueGene/L systems
Proceedings of the 19th annual international conference on Supercomputing
Design of High Performance MVAPICH2: MPI2 over InfiniBand
CCGRID '06 Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid
Performance without pain = productivity: data layout and collective communication in UPC
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Implementation and performance analysis of non-blocking collective operations for MPI
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Proceedings of the 22nd annual international conference on Supercomputing
Overview of the IBM Blue Gene/P project
IBM Journal of Research and Development
Overview of the Blue Gene/L system architecture
IBM Journal of Research and Development
Design and implementation of message-passing services for the Blue Gene/L supercomputer
IBM Journal of Research and Development
Efficient implementation of allreduce on bluegene/l collective network
PVM/MPI'05 Proceedings of the 12th European PVM/MPI users' group conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Proceedings of the 16th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
International Journal of High Performance Computing Applications
Versatile communication algorithms for data analysis
EuroMPI'12 Proceedings of the 19th European conference on Recent Advances in the Message Passing Interface
Hi-index | 0.00 |
Different programming paradigms utilize a variety of collective communication operations, often with different semantics. We present the component collective messaging interface (CCMI), that can support asynchronous non-blocking collectives and is extensible to different programming paradigms and architectures. CCMI is designed with components written in the C++ programming language, allowing it to have reuse and extendability. Collective algorithms are embodied in topological schedulesand executorsthat execute them. Portability across architectures is enabled by the multisend data movement component. CCMI includes a programming language adaptor used to implement different APIs with different semantics for different paradigms. We study the effectiveness of CCMI on Blue Gene/P and evaluate its performance for the barrier, broadcast, and allreduce collective operations. We also present the performance of the barrier collective on the Abe Infiniband cluster.