A call-graph profiling tool has been designed and implemented to analyse the efficiency of programs written in BSPlib. This tool highlights computation and communication imbalance in parallel programs, exposing portions of program code that are amenable to improvement. A unique feature of this profiler is that it uses the bulk synchronous parallel cost model, thus providing a mechanism for portable and architecture-independent parallel performance tuning. To test the capabilities of the model on a real-world example, the performance characteristics of an SQL query processing application are investigated on a number of different parallel architectures.
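
As a rough illustration of the kind of accounting the bulk synchronous parallel cost model allows, the sketch below costs a single superstep as w + h·g + l, where w is the largest local computation time, h is the largest number of words any process sends or receives, g is the communication cost per word, and l is the barrier synchronisation cost. It is not the profiler described here: the `Superstep` class, the function names, and the max-over-mean imbalance ratio are assumptions made for illustration, and taking h as the maximum of words sent and received is one common convention among several.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Superstep:
    """Per-process measurements for one superstep (hypothetical layout)."""
    work: List[float]    # local computation time of each process
    sent: List[int]      # words sent by each process
    received: List[int]  # words received by each process


def superstep_cost(step: Superstep, g: float, l: float) -> float:
    """Standard BSP cost of one superstep: w + h*g + l."""
    w = max(step.work)
    # h-relation: largest fan-out or fan-in over all processes (assumed convention).
    h = max(max(s, r) for s, r in zip(step.sent, step.received))
    return w + h * g + l


def imbalance(values: List[float]) -> float:
    """Ratio of the maximum to the mean; 1.0 means perfectly balanced."""
    mean = sum(values) / len(values)
    return max(values) / mean if mean > 0 else 1.0


# Four processes; process 1 computes the most, process 3 sends the most.
step = Superstep(
    work=[100.0, 120.0, 80.0, 100.0],
    sent=[10, 10, 10, 30],
    received=[15, 15, 15, 15],
)
cost = superstep_cost(step, g=4.0, l=50.0)
work_imbalance = imbalance(step.work)
comm_imbalance = imbalance([float(max(s, r)) for s, r in zip(step.sent, step.received)])
```

Because g and l are the only machine-dependent terms, the same per-superstep measurements can be re-costed for a different architecture by substituting that machine's g and l, which is the sense in which such tuning is architecture-independent.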