Loops are the richest source of parallelism in scientific applications. A large number of loop scheduling schemes have therefore been devised for loops with and without data dependencies (modeled as dependence distance vectors) on heterogeneous clusters. Loops with data dependencies require synchronization via cross-node communication, and this synchronization must be tuned carefully to offset the communication overhead and yield the best possible overall performance. In this paper, a theoretical model is presented to determine the granularity of synchronization that minimizes the parallel execution time of loops with data dependencies when these are parallelized on heterogeneous systems using dynamic self-scheduling algorithms. New formulas are proposed for estimating the total number of scheduling steps when a threshold on the minimum amount of work assigned to a processor is assumed. The proposed model uses these formulas to determine the synchronization granularity that minimizes the estimated parallel execution time. The accuracy of the proposed model is verified and validated via extensive experiments on a heterogeneous computing system. The results show that the theoretically optimal synchronization granularity, as determined by the proposed model, is very close to the experimentally observed optimal synchronization granularity: it matches exactly in the best case and deviates by at most 38.4% in the worst case. Copyright © 2012 John Wiley & Sons, Ltd.
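The abstract refers to estimating the total number of scheduling steps under a minimum-work threshold. The paper's own closed-form formulas are not reproduced here; as a hypothetical illustration only, the following sketch simulates the step count for one well-known dynamic self-scheduling scheme, guided self-scheduling (GSS), with a minimum chunk size imposed. The function name and parameters are illustrative, not taken from the paper.

```python
def gss_steps(n_iterations: int, n_processors: int, min_chunk: int) -> int:
    """Simulate guided self-scheduling with a minimum chunk threshold.

    At each scheduling step a processor receives ceil(R / P) iterations
    of the remaining work R, but never fewer than min_chunk (and never
    more than what remains). Returns the total number of scheduling
    steps, i.e., the quantity the paper's formulas estimate in closed
    form for its schemes.
    """
    remaining = n_iterations
    steps = 0
    while remaining > 0:
        chunk = max(-(-remaining // n_processors), min_chunk)  # ceil division
        chunk = min(chunk, remaining)  # last chunk may be smaller
        remaining -= chunk
        steps += 1
    return steps


# A larger minimum chunk coarsens the schedule: fewer scheduling steps,
# hence fewer synchronization points, at the cost of coarser load balance.
print(gss_steps(100, 4, 1))   # fine-grained schedule
print(gss_steps(100, 4, 10))  # coarser schedule, fewer steps
```

Raising the threshold trades scheduling/synchronization overhead against load-balancing flexibility, which is exactly the trade-off the model's optimal granularity resolves.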