Inter- and intra-machine MPI collective operations in current multicore clusters are essentially different, so modelling their performance calls for different approaches. Inside a multicore node, each individual message transmission in a collective operation flows in parallel with the others, but all of them share the bandwidth of the main memory channel. Current models ignore this sharing and consequently make errors such as assigning the same cost estimate to quite different collective algorithms. We outline a new cost model focused on shared channels.
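
To illustrate the kind of error described above, the following is a minimal sketch, not the model proposed in the paper. It assumes a simple latency-plus-bandwidth cost per transfer (a Hockney-style alpha + m * beta) and a deliberately crude contention rule in which k concurrent transfers on the same memory channel each see 1/k of the bandwidth. The function names, the stage-list representation of an algorithm, and the parameter values are all hypothetical.

```python
# Illustrative sketch only: the names, the stage-list representation and the
# contention rule (k concurrent transfers each get 1/k of the bandwidth) are
# assumptions made for this example, not the cost model outlined in the paper.

def oblivious_cost(stages, m, alpha, beta):
    """Contention-oblivious estimate: every stage costs alpha + m * beta,
    regardless of how many transfers run concurrently in that stage."""
    return sum(alpha + m * beta for _ in stages)


def shared_channel_cost(stages, m, alpha, beta):
    """Estimate with a shared channel: the k concurrent transfers of a stage
    split the bandwidth, so the stage costs alpha + k * m * beta."""
    return sum(alpha + k * m * beta for k in stages)


# Each algorithm is described by the number of concurrent transfers per stage.
tree_like = [1, 2, 4]    # concurrency doubles at each stage (hypothetical)
chain_like = [1, 1, 1]   # one transfer per stage (hypothetical)

m, alpha, beta = 4, 1.0, 0.5  # message size, latency, inverse bandwidth (made up)

same_estimate = (oblivious_cost(tree_like, m, alpha, beta)
                 == oblivious_cost(chain_like, m, alpha, beta))
tree_shared = shared_channel_cost(tree_like, m, alpha, beta)
chain_shared = shared_channel_cost(chain_like, m, alpha, beta)
```

Under these assumptions the contention-oblivious estimate cannot tell the two three-stage schedules apart, whereas the shared-channel estimate charges the schedule with more concurrent transfers a higher cost.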