Studying the impact of synchronization frequency on scheduling tasks with dependencies in heterogeneous systems

Authors:
Theodore Andronikos;Florina M. Ciorba;Ioannis Riakiotakis;George Papakonstantinou;Anthony T. Chronopoulos
Affiliations:
Department of Informatics, Ionian University, Corfu, Greece;Computing Systems Laboratory, Department of Electrical & Computer Engineering, National Technical University of Athens, Greece and Center for Advanced Vehicular Systems, Mississippi State Universi ...;Computing Systems Laboratory, Department of Electrical & Computer Engineering, National Technical University of Athens, Greece;Computing Systems Laboratory, Department of Electrical & Computer Engineering, National Technical University of Athens, Greece;Department of Computer Science, University of Texas at San Antonio, 6900 N. Loop 1604 West, San Antonio, TX 78249, USA
Venue:
Performance Evaluation
Year:
2010

Abstract

In this work, we develop and evaluate a theoretical model and use it to study the impact of the synchronization frequency on the performance of dynamic self-scheduling algorithms. These algorithms are used to parallelize loops with data dependencies on heterogeneous systems. The proposed model uses a formula to estimate the parallel time as a function of the synchronization frequency. Inter-node communication has been shown to be the dominant factor in the performance degradation of applications containing loops with data dependencies, so the synchronization mechanism requires careful fine-tuning to give the best possible performance. The proposed model determines the optimal synchronization frequency, that is, the one that results in the minimum parallel time. We use this model to study the impact of the synchronization frequency on the parallel execution of a computational kernel from image processing. For this kernel, the synchronization frequency that our theoretical model predicted would give the minimum parallel time was very close to the synchronization frequency that gave the minimum parallel time in practice. We validate our model by extensive comparisons of the theoretically predicted parallel time and synchronization frequency against those obtained from practical experiments. The comparisons show that the proposed model is highly accurate: its predictions for the optimal synchronization frequency are within 0.0250% of the experimentally optimal synchronization frequency in the best case, and within 0.1750% in the worst case. Finally, the comparisons show that the proposed model improves on a previously existing model in heterogeneous systems, whereas it gives similar results in homogeneous systems.
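The abstract does not give the paper's formula, so the following Python sketch only illustrates the general idea of estimating parallel time as a function of the synchronization frequency and then minimizing it. It uses a generic textbook-style pipelined cost model, not the authors' model. The function names and the parameters `n_rows`, `workers`, `t_row`, and `t_sync` are all hypothetical. The synchronization interval `h` is the number of rows computed between two synchronization points, so a small `h` means frequent synchronization.

```python
import math


def parallel_time(h, n_rows, workers, t_row, t_sync):
    """Toy pipelined cost model (an assumption, NOT the paper's formula).

    h       -- rows computed between two synchronization points
    n_rows  -- iterations along the synchronized loop dimension
    workers -- number of pipelined workers
    t_row   -- time to compute one row of a worker's chunk
    t_sync  -- fixed cost of one synchronization (communication) step
    """
    steps = math.ceil(n_rows / h)         # synchronization steps per worker
    stage = h * t_row + t_sync            # duration of one pipeline stage
    return (steps + workers - 1) * stage  # steady state plus pipeline fill


def best_interval(n_rows, workers, t_row, t_sync):
    """Exhaustively search integer intervals 1..n_rows for the minimum time."""
    return min(
        range(1, n_rows + 1),
        key=lambda h: parallel_time(h, n_rows, workers, t_row, t_sync),
    )


def continuous_estimate(n_rows, workers, t_row, t_sync):
    """Minimizer of the toy model with the ceiling dropped (needs workers > 1)."""
    return math.sqrt(n_rows * t_sync / ((workers - 1) * t_row))
```

In this toy model, synchronizing too often multiplies the fixed cost `t_sync`, while synchronizing too rarely lengthens each pipeline stage and delays downstream workers. The minimum therefore lies between the two extremes, which mirrors the trade-off the abstract describes.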