On the performance of concurrent transfers in collective algorithms

Authors:
Juan-Antonio Rico-Gallego;Juan-Carlos Díaz-Martín
Affiliations:
University of Extremadura, Escuela Politécnica, Cáceres, Spain;University of Extremadura, Escuela Politécnica, Cáceres, Spain
Venue:
Proceedings of the 20th European MPI Users' Group Meeting
Year:
2013

Abstract

Inter- and intra-machine MPI collective operations in current multicore clusters are essentially different, and their performance modelling therefore calls for different approaches. Inside a multicore machine, each individual message transmission in a collective operation flows in parallel with the others, but all of them share the bandwidth of the main memory channel. Current models ignore this issue, which leads to errors such as assigning the same cost estimate to quite different collective algorithms. We outline a new cost model focused on shared channels.
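
The effect described in the abstract can be illustrated with a toy calculation. The sketch below is not the cost model proposed in the paper; it is a minimal illustration under stated assumptions. A collective algorithm is reduced to a list of stages, each holding the number of transfers that run concurrently, and the names `cost_independent`, `cost_shared_channel`, `alpha` (per-stage latency), and `beta` (time per byte on an uncontended channel) are hypothetical. The shared-channel variant assumes that concurrent transfers split the memory bandwidth evenly.

```python
from typing import List


def cost_independent(stages: List[int], m: int, alpha: float, beta: float) -> float:
    """Contention-unaware estimate (hypothetical helper).

    Every stage costs alpha + m * beta, no matter how many
    transfers run concurrently in that stage.
    """
    return sum(alpha + m * beta for _ in stages)


def cost_shared_channel(stages: List[int], m: int, alpha: float, beta: float) -> float:
    """Toy shared-channel estimate (assumption: even bandwidth split).

    The c concurrent transfers of a stage share one memory channel,
    so each of them takes c times longer than it would alone.
    """
    return sum(alpha + c * m * beta for c in stages)


# Two hypothetical three-stage schedules moving messages of m bytes:
fan_out = [1, 2, 4]  # concurrency doubles at every stage
serial = [1, 1, 1]   # one transfer at a time

m, alpha, beta = 8, 2.0, 0.5  # illustrative values, not measurements

print(cost_independent(fan_out, m, alpha, beta))     # 18.0
print(cost_independent(serial, m, alpha, beta))      # 18.0
print(cost_shared_channel(fan_out, m, alpha, beta))  # 34.0
print(cost_shared_channel(serial, m, alpha, beta))   # 18.0
```

With these illustrative values the contention-unaware estimate cannot tell the two schedules apart, while the toy shared-channel estimate separates them as soon as more than one transfer uses the memory channel at once.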