This article concentrates on recent work on benchmarking collective operations with SKaMPI. The goal of the SKaMPI project is to create a database of performance measurements of parallel computers in terms of MPI operations. These data help software developers write portable and fast programs. Existing algorithms for timing collective operations are discussed, and a new algorithm is presented that takes into account the differences between local clocks. Measurement results on a Cray T3E/900 and an IBM RS 6000 SP are presented.
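
To illustrate what "taking into account the differences of local clocks" can involve, the following is a minimal sketch of a generic ping-pong offset estimate between two clocks. It is not the algorithm presented in the article, whose details the abstract does not give. The function names (`estimate_offset`, `simulate_exchanges`), the simulated delays, and the chosen offset are all assumptions made for illustration; a real benchmark would read timestamps with `MPI_Wtime` on each process instead of simulating them.

```python
import random


def estimate_offset(exchanges):
    """Estimate (remote clock - local clock) from ping-pong timestamps.

    exchanges: list of (t_send, t_remote, t_recv) tuples, where t_send and
    t_recv are read on the local clock and t_remote on the remote clock.
    Returns (offset, error_bound), taken from the exchange with the
    smallest round-trip time, since it constrains the offset most tightly.
    """
    t_send, t_remote, t_recv = min(exchanges, key=lambda e: e[2] - e[0])
    round_trip = t_recv - t_send
    # Assume the remote timestamp was taken halfway through the round trip.
    offset = t_remote - (t_send + t_recv) / 2.0
    # The true offset cannot differ from the estimate by more than half
    # the round-trip time, whatever the split between the two directions.
    return offset, round_trip / 2.0


def simulate_exchanges(true_offset, count, rng):
    """Hypothetical stand-in for real message exchanges (no MPI needed)."""
    exchanges = []
    now = 0.0
    for _ in range(count):
        t_send = now
        delay_out = 10e-6 + rng.uniform(0.0, 50e-6)   # local -> remote
        delay_back = 10e-6 + rng.uniform(0.0, 50e-6)  # remote -> local
        t_remote = t_send + delay_out + true_offset   # remote clock reading
        t_recv = t_send + delay_out + delay_back      # local clock reading
        exchanges.append((t_send, t_remote, t_recv))
        now = t_recv + 1e-4                           # pause between rounds
    return exchanges


rng = random.Random(0)
true_offset = 0.0123  # assumed clock difference in seconds
exchanges = simulate_exchanges(true_offset, 100, rng)
offset, bound = estimate_offset(exchanges)
print(f"estimated offset: {offset:.6f} s (+/- {bound:.6f} s)")
```

With such an estimate, a root process could translate a common start time into each process's local clock, so that all processes enter the collective operation at approximately the same moment and the measured time is not distorted by clock differences.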