Efficient hybrid parallelisation of tiled algorithms on SMP clusters

Authors:
Nikolaos Drosinos;Nectarios Koziris
Affiliations:
School of Electrical and Computer Engineering, Computing Systems Laboratory, National Technical University of Athens, Iroon Polytexneioy 9, 15780 Zografou Campus, Athens, Greece.;School of Electrical and Computer Engineering, Computing Systems Laboratory, National Technical University of Athens, Iroon Polytexneioy 9, 15780 Zografou Campus, Athens, Greece
Venue:
International Journal of Computational Science and Engineering
Year:
2009

Abstract

This paper focuses on load-balancing issues that arise in the hybrid parallelisation of tiled algorithms on SMP clusters. Hybrid parallelisation is intrinsically imbalanced because message-passing libraries often provide limited multi-threading support, so that only the master thread can perform inter-node communication. To mitigate this effect, we propose a generic method that applies load balancing to the coarse-grain hybrid model, distributing the computational load appropriately among the working threads. We investigate both static and dynamic load balancing, and experimentally evaluate three balancing variations on kernel benchmarks.
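
To make the source of the imbalance concrete, the following is a minimal sketch of static load balancing under a deliberately simplified cost model. The model is an assumption made for illustration and is not taken from the paper: each tile carries a fixed amount of computation (`total_work`), the master thread alone pays a fixed communication cost (`comm_cost`), and communication does not overlap with computation. The function names `balanced_shares` and `makespan` are hypothetical.

```python
def balanced_shares(total_work, comm_cost, num_threads):
    """Split total_work so that all threads finish at about the same time.

    Assumed model (illustrative only): thread 0 is the master and spends
    comm_cost on inter-node communication in addition to its compute share.
    Returns a list of compute shares, master first.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    if num_threads == 1:
        return [float(total_work)]

    # Solve: master + comm_cost == worker, master + (n - 1) * worker == total_work
    worker = (total_work + comm_cost) / num_threads
    master = worker - comm_cost

    if master < 0:
        # Communication alone exceeds a fair share: the master computes
        # nothing and the remaining threads split all the work evenly.
        master = 0.0
        worker = total_work / (num_threads - 1)

    return [master] + [worker] * (num_threads - 1)


def makespan(shares, comm_cost):
    """Time at which the slowest thread finishes under the assumed model."""
    return max([shares[0] + comm_cost] + shares[1:])


if __name__ == "__main__":
    even = [100.0 / 4] * 4                      # naive equal split
    tuned = balanced_shares(100.0, 20.0, 4)     # master gets a smaller share
    print(makespan(even, 20.0), makespan(tuned, 20.0))
```

In this toy model an equal split leaves the other threads idle while the master communicates, whereas giving the master a smaller compute share shortens the time per tile. A dynamic scheme would instead adjust the shares at run time, which this sketch does not attempt.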