Towards the optimal synchronization granularity for dynamic scheduling of pipelined computations on heterogeneous computing systems

  • Authors:
  • I. Riakiotakis; F. M. Ciorba; T. Andronikos; G. Papakonstantinou; A. T. Chronopoulos

  • Affiliations:
  • School of Electrical and Computer Engineering, National Technical University of Athens, 9, Heroon Polytechnioy, Zografou, 15773, Athens, Greece; Center for Information Services and High Performance Computing, Technische Universität Dresden, Zellescher Weg 12/14, Dresden, 01062, Germany; Department of Informatics, Ionian University, 7, Tsirigoti Square, 49100 Corfu, Greece; School of Electrical and Computer Engineering, National Technical University of Athens, 9, Heroon Polytechnioy, Zografou, 15773, Athens, Greece; Department of Computer Science, University of Texas at San Antonio, TX 78249, USA

  • Venue:
  • Concurrency and Computation: Practice & Experience
  • Year:
  • 2012

Abstract

Loops are the richest source of parallelism in scientific applications. A large number of loop scheduling schemes have therefore been devised for loops with and without data dependencies (modeled as dependence distance vectors) on heterogeneous clusters. The loops with data dependencies require synchronization via cross-node communication. Synchronization requires fine-tuning to overcome the communication overhead and to yield the best possible overall performance. In this paper, a theoretical model is presented to determine the granularity of synchronization that minimizes the parallel execution time of loops with data dependencies when these are parallelized on heterogeneous systems using dynamic self-scheduling algorithms. New formulas are proposed for estimating the total number of scheduling steps when a threshold for the minimum work assigned to a processor is assumed. The proposed model uses these formulas to determine the synchronization granularity that minimizes the estimated parallel execution time. The accuracy of the proposed model is verified and validated via extensive experiments on a heterogeneous computing system. The results show that the theoretically optimal synchronization granularity, as determined by the proposed model, is very close to the experimentally observed optimal synchronization granularity, with no deviation in the best case, and within 38.4% in the worst case. Copyright © 2012 John Wiley & Sons, Ltd.
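
To make the notion of "number of scheduling steps under a minimum-work threshold" concrete, the following is a minimal sketch and not the paper's formulas. It assumes a guided self-scheduling rule (each request receives roughly the remaining iterations divided by the number of workers), which is one common dynamic self-scheduling scheme; the function name `count_scheduling_steps` and its parameters are hypothetical and chosen only for illustration.

```python
import math


def count_scheduling_steps(total_iterations: int, num_workers: int, min_chunk: int = 1) -> int:
    """Count the scheduling steps needed to hand out all iterations.

    Assumption (not from the paper): chunks follow a guided self-scheduling
    rule, ceil(remaining / num_workers), but never drop below min_chunk
    unless fewer than min_chunk iterations are left.
    """
    remaining = total_iterations
    steps = 0
    while remaining > 0:
        # Guided rule, floored by the minimum-work threshold.
        chunk = max(min_chunk, math.ceil(remaining / num_workers))
        # The final chunk cannot exceed what is left.
        chunk = min(chunk, remaining)
        remaining -= chunk
        steps += 1
    return steps


if __name__ == "__main__":
    # Compare the step count without and with a minimum-work threshold.
    print(count_scheduling_steps(100, 4, min_chunk=1))
    print(count_scheduling_steps(100, 4, min_chunk=10))
```

Under this assumed rule, raising the threshold reduces the number of scheduling steps (and hence scheduling requests), which is the kind of quantity an execution-time model needs to estimate before it can weigh synchronization frequency against communication overhead.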