Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Benchmarking MPI is a contentious subject at best. Micro-benchmarks are used because they are easy to port and, hypothetically, measure an important system characteristic in isolation. The unfortunate reality is that it is remarkably difficult to create a benchmark that is a fair measurement in the context of a modern system. Software optimizations and modern processor architectures perform extremely efficiently on benchmarks where they would not do so in an application context. This paper explores the challenges faced when benchmarking the network in a modern microprocessor climate and the remarkable impact of those challenges on the results that are obtained.