In parallel applications, a significant share of communication is collective: broadcasts, reductions, and complete exchanges are typical examples. The MPI standard defines many convenience functions for this purpose, which improve code readability and maintainability and are usually also highly efficient. Nevertheless, many application programmers still write their own implementations by hand using point-to-point communication. We show how instances of such hand-crafted collectives can be detected automatically. Because it matches pre- and post-conditions of hashed message exchanges recorded in event traces, our method is independent of the specific communication pattern employed. We demonstrate that replacing detected broadcasts in the HPL benchmark can yield significant performance improvements.
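
To make the idea of matching pre- and post-conditions more concrete, the following is a minimal sketch and not the method of the paper itself. It assumes a simplified trace in which every point-to-point message is a tuple of sender rank, receiver rank, and payload hash, listed in time order. The function name `detect_broadcasts` and the trace layout are hypothetical. The pre-condition checked is that exactly one rank (the root) holds the payload before the exchange; the post-condition is that every rank holds it afterwards. Because only these two conditions are tested, a linear fan-out and a tree-shaped fan-out are recognized alike.

```python
from collections import defaultdict


def detect_broadcasts(events, num_ranks):
    """Find hand-crafted broadcasts in a simplified message trace.

    events: list of (sender, receiver, payload_hash) tuples in time order
            (a hypothetical trace format chosen for this sketch).
    num_ranks: total number of processes, ranked 0 .. num_ranks - 1.
    Returns a list of (root, payload_hash) pairs.
    """
    # Group the messages by the hash of their payload.
    by_hash = defaultdict(list)
    for sender, receiver, payload_hash in events:
        by_hash[payload_hash].append((sender, receiver))

    all_ranks = set(range(num_ranks))
    found = []
    for payload_hash, messages in by_hash.items():
        senders = {s for s, _ in messages}
        receivers = {r for _, r in messages}

        # Pre-condition: exactly one rank sends the payload
        # without ever having received it. That rank is the root.
        roots = senders - receivers
        if len(roots) != 1:
            continue
        root = next(iter(roots))

        # Replay the messages: a rank may only forward the payload
        # once it holds it.
        holders = {root}
        consistent = True
        for sender, receiver in messages:
            if sender not in holders:
                consistent = False
                break
            holders.add(receiver)

        # Post-condition: every rank holds the payload at the end.
        if consistent and holders == all_ranks:
            found.append((root, payload_hash))
    return found


if __name__ == "__main__":
    linear = [(0, 1, "h1"), (0, 2, "h1"), (0, 3, "h1")]
    tree = [(0, 1, "h2"), (0, 2, "h2"), (1, 3, "h2")]
    print(detect_broadcasts(linear, 4))
    print(detect_broadcasts(tree, 4))
```

In an MPI program, each exchange flagged this way would be a candidate for replacement by a single call to `MPI_Bcast`. A real trace analysis would also have to account for communicators, message tags, and timing, which this sketch leaves out.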