Communication Benchmarking and Performance Modelling of MPI Programs on Cluster Computers

  • Authors:
  • D. A. Grove; P. D. Coddington

  • Affiliations:
  • School of Computer Science, University of Adelaide, Adelaide, Australia 5005 (both authors)

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 2005

Abstract

This paper gives an overview of two related tools that we have developed to provide more accurate measurement and modelling of the performance of message-passing communication and application programs on distributed memory parallel computers. MPIBench uses a very precise, globally synchronised clock to measure the performance of MPI communication routines. It can generate probability distributions of communication times, not just the average values produced by other MPI benchmarks. This provides useful insights into the MPI communication performance of parallel computers, and in particular into how performance is affected by network contention. The Performance Evaluating Virtual Parallel Machine (PEVPM) provides a simple, fast and accurate technique for modelling and predicting the performance of message-passing parallel programs. It uses a virtual parallel machine to simulate the execution of the parallel program. The effects of network contention can be accurately modelled by sampling from the probability distributions generated by MPIBench. These tools are particularly useful on clusters with commodity Ethernet networks, where relatively high latencies, network congestion and TCP problems can significantly affect communication performance, which is difficult to model accurately using other tools. Experiments with example parallel programs demonstrate that PEVPM gives accurate performance predictions on commodity clusters. We also show that modelling communication performance using average times, rather than sampling from probability distributions, can give misleading results, particularly for programs running on a large number of processors.
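
To make the measurement idea concrete, the following is a minimal sketch of distribution-based MPI benchmarking in the spirit of MPIBench, not the tool itself. Each repetition of a ping-pong exchange is timed individually so that a probability distribution of communication times can be built offline, rather than reporting only an average. The message size, repetition count and the use of MPI_Barrier in place of MPIBench's globally synchronised clock are illustrative assumptions.

/*
 * Sketch of distribution-based point-to-point benchmarking (assumed
 * parameters; not the actual MPIBench implementation).
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NSAMPLES 1000   /* assumed number of repetitions  */
#define MSG_SIZE 4096   /* assumed message size in bytes  */

int main(int argc, char **argv)
{
    int rank, size, i;
    char *buf;
    double *times;      /* one entry per repetition, not a running sum */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        if (rank == 0)
            fprintf(stderr, "needs at least 2 processes\n");
        MPI_Finalize();
        return 1;
    }

    buf   = malloc(MSG_SIZE);
    times = malloc(NSAMPLES * sizeof(double));

    for (i = 0; i < NSAMPLES; i++) {
        double t0;
        /* The barrier stands in for MPIBench's globally synchronised
         * clock: it gives every repetition a common starting point.  */
        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        if (rank == 0) {
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
        times[i] = (MPI_Wtime() - t0) / 2.0;   /* rough one-way time */
    }

    if (rank == 0) {
        /* Print every individual sample so a histogram or probability
         * distribution can be built, not only the mean.              */
        for (i = 0; i < NSAMPLES; i++)
            printf("%d %.9f\n", i, times[i]);
    }

    free(buf);
    free(times);
    MPI_Finalize();
    return 0;
}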
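
The next sketch illustrates, with made-up numbers, why predicting runtime by sampling from a measured distribution can differ from using average times, in line with the abstract's final point; it is not the actual PEVPM model. Each simulated step completes only when the slowest of P concurrent messages arrives, so its expected duration grows with P even though the mean per-message time does not. The sample values, process counts and the helper name sample_time are hypothetical.

/*
 * Sketch comparing average-based and distribution-based prediction of
 * the time for a sequence of synchronised communication steps
 * (assumed data; not the actual PEVPM implementation).
 */
#include <stdio.h>
#include <stdlib.h>

/* Draw one communication time uniformly from an empirical set of
 * measured samples (hypothetical helper). */
static double sample_time(const double *samples, int n)
{
    return samples[rand() % n];
}

int main(void)
{
    /* Pretend these are per-message times (seconds) measured by a
     * benchmark such as MPIBench: mostly ~100 us, with occasional
     * contention-induced outliers.  The values are made up.        */
    double measured[] = { 1.0e-4, 1.1e-4, 9.0e-5, 1.0e-4, 1.2e-4,
                          1.0e-4, 1.1e-4, 9.5e-5, 5.0e-4, 2.0e-3 };
    int nsamples = (int)(sizeof(measured) / sizeof(measured[0]));
    int procs[]  = { 4, 16, 64, 256 };   /* process counts to compare */
    int nconfigs = (int)(sizeof(procs) / sizeof(procs[0]));
    int nsteps   = 1000;                 /* simulated program steps   */
    double mean  = 0.0;
    int i, p, s, k;

    for (i = 0; i < nsamples; i++)
        mean += measured[i];
    mean /= nsamples;

    srand(12345);
    for (p = 0; p < nconfigs; p++) {
        int P = procs[p];
        double sampled = 0.0;
        for (s = 0; s < nsteps; s++) {
            /* A synchronised step finishes only when the slowest of
             * the P concurrent messages has arrived.               */
            double worst = 0.0;
            for (k = 0; k < P; k++) {
                double t = sample_time(measured, nsamples);
                if (t > worst)
                    worst = t;
            }
            sampled += worst;
        }
        printf("P=%4d  average-based %.4f s   distribution-based %.4f s\n",
               P, mean * nsteps, sampled);
    }
    return 0;
}

With these assumed numbers, the average-based estimate stays the same for every process count, while the distribution-based estimate rises as more processes are affected by occasional slow messages, which is the behaviour the abstract attributes to large processor counts.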