An object-oriented bulk synchronous parallel library for multicore programming

Authors:
A.N. Yzelman; Rob H. Bisseling
Affiliations:
Mathematical Institute, Utrecht University, P.O. Box 80010, 3508 TA Utrecht, The Netherlands (both authors)
Venue:
Concurrency and Computation: Practice & Experience
Year:
2012





Abstract

We show that the bulk synchronous parallel (BSP) model, originally designed for distributed-memory systems, is also applicable to shared-memory multicore systems and, furthermore, that BSP libraries are useful in scientific computing on these systems. A proof-of-concept MulticoreBSP library has been implemented in Java and is used to show that BSP algorithms can attain proper speedups on multicore architectures. This library is based on the BSPlib implementation, adapted to an object-oriented setting. Compared with BSPlib, the number of function primitives is reduced and the overall design is simpler. We detail how the BSP model and the library are applied to the sparse matrix–vector (SpMV) multiplication problem, and show by numerical experiments that the resulting BSP SpMV algorithm attains speedups, in one case reaching a speedup of 3.5 on 4 threads. Although not described in detail in this paper, algorithms for the fast Fourier transform and the dense LU decomposition are also investigated, in one case attaining a superlinear speedup of 5 on 4 threads. The predictability of BSP algorithms is also investigated for the SpMV case. Copyright © 2011 John Wiley & Sons, Ltd.
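
To make the superstep structure behind a shared-memory BSP SpMV concrete, the following is a minimal sketch in plain Java. It does not use the MulticoreBSP API, which the abstract does not describe; instead it assumes only `java.util.concurrent.CyclicBarrier` as the synchronisation primitive. The class name `BspSpmvSketch`, the method `repeatedMultiply`, the compressed-row storage arrays, and the block row distribution are all illustrative assumptions, not the authors' implementation. Each thread computes its own block of rows in a computation phase, and the barrier ends the superstep so that the next multiplication can safely read vector entries written by other threads.

```java
import java.util.Arrays;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Hypothetical sketch: BSP-style repeated SpMV on shared memory.
// Not the MulticoreBSP API; all names here are assumptions for illustration.
final class BspSpmvSketch {

    // Computes A^steps * x for a square sparse matrix A in compressed row storage:
    // rowStart has n+1 entries; colIndex and values hold the nonzeros row by row.
    static double[] repeatedMultiply(int[] rowStart, int[] colIndex, double[] values,
                                     double[] x, int p, int steps) {
        final int n = rowStart.length - 1;
        // Two buffers: each superstep reads one and writes the other.
        final double[][] buf = { x.clone(), new double[n] };
        final CyclicBarrier barrier = new CyclicBarrier(p);
        Thread[] threads = new Thread[p];
        for (int s = 0; s < p; s++) {
            // Block distribution: thread s owns rows lo (inclusive) to hi (exclusive).
            final int lo = (int) ((long) s * n / p);
            final int hi = (int) ((long) (s + 1) * n / p);
            threads[s] = new Thread(() -> {
                try {
                    for (int step = 0; step < steps; step++) {
                        double[] in = buf[step % 2];
                        double[] out = buf[(step + 1) % 2];
                        // Computation phase: local rows only, reading the shared input.
                        for (int i = lo; i < hi; i++) {
                            double sum = 0.0;
                            for (int k = rowStart[i]; k < rowStart[i + 1]; k++) {
                                sum += values[k] * in[colIndex[k]];
                            }
                            out[i] = sum;
                        }
                        // Synchronisation: ends the superstep; afterwards every
                        // thread may read all entries written during this step.
                        barrier.await();
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
            });
            threads[s].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return buf[steps % 2];
    }

    public static void main(String[] args) {
        // A = [[2,0,1],[0,3,0],[4,0,5]], x = [1,2,3]
        int[] rowStart = {0, 2, 3, 5};
        int[] colIndex = {0, 2, 1, 0, 2};
        double[] values = {2, 1, 3, 4, 5};
        double[] x = {1, 2, 3};
        System.out.println(Arrays.toString(
                repeatedMultiply(rowStart, colIndex, values, x, 2, 1))); // [5.0, 6.0, 19.0]
    }
}
```

The double buffering is a design choice of this sketch: because a thread only starts writing a buffer after all threads have passed the barrier that follows its last read, one barrier per superstep suffices. A real BSP library would typically express the same pattern through its own communication and synchronisation primitives.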