Performance Prediction of Oblivious BSP Programs
Euro-Par '01: Proceedings of the 7th International Euro-Par Conference on Parallel Processing, Manchester
The accumulated experience indicates that complexity models such as LogP or BSP, which characterize the performance of distributed machines through a few parameters, incur a considerable loss of accuracy; errors range up to 70%. The complexity analysis model presented here still uses the BSP concept of superstep, but introduces a few novelties. To cover both oblivious synchronization and group partitioning, we must admit that different processors may finish the same superstep at different times. The other extension recognizes that, even if the numbers of individual communication or computation operations in two stages are the same, the actual times of the two stages may differ. These differences are due to the different nature of the operations or to the particular pattern followed by the messages. A natural proposal is to associate a different proportionality constant with each basic block and, analogously, different latencies and bandwidths with the different communication patterns. Unfortunately, this approach implies that the parameter values depend not only on the given architecture but also reflect algorithm characteristics, so the parameters must be evaluated anew for every algorithm. This is a heavy task, involving experiment design, timing, statistics, and multi-parameter fitting; software support is required. We have developed a compiler that takes as input a C program annotated with complexity formulas and produces as output an instrumented version of the code. The trace files obtained from executing the resulting code are analyzed with an interactive interpreter, which gives us, among other information, the values of those parameters.
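As background for the per-block refinement described above, the standard BSP superstep cost and a sketch of the extended model can be written as follows (the notation here is ours, not taken from the paper):

```latex
% Standard BSP: a superstep with at most w local operations per
% processor and an h-relation costs
T_{\mathrm{step}} = w + g\,h + L
% where g is the inverse communication bandwidth and L the
% synchronization latency.

% Per-block refinement sketched in the abstract: give each basic
% block b its own proportionality constant c_b, and each
% communication pattern \pi its own g_\pi and L_\pi:
T_{\mathrm{step}} = \max_{0 \le i < p}
  \Big( \sum_{b} c_b \, n_{b,i} \Big) + g_\pi\, h + L_\pi
% where n_{b,i} counts executions of block b on processor i.
```

Under oblivious synchronization the max over processors is only an upper bound, since processors may leave the superstep at different times; the model in the paper accounts for this.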