In this paper, we analyze the limitations of traditional communication performance models that reduce the accuracy of analytical predictions of the execution time of collective communication operations on homogeneous and heterogeneous clusters. In particular, we show that these models do not fully separate the constant and variable contributions of the processors and the network. Fully separating contributions that differ in nature and arise from different sources would yield more intuitive and accurate models, but the parameters of such models cannot be estimated from point-to-point experiments alone, which are what traditional models usually rely on. We present such an intuitive and accurate point-to-point model and describe a set of communication experiments sufficient to estimate its parameters. We also present an implementation of the new model as a software tool that automates the estimation of both this model and heterogeneous extensions of traditional communication performance models. We conclude with experimental results demonstrating that the proposed model predicts the execution time of different algorithms for collective operations much more accurately than traditional models.