Possibilities of Optimal Execution of Parallel Programs Containing Simple and Iterated Loops on Heterogeneous Parallel Computational Systems with Distributed Memory

  • Authors:
  • A. I. Avetisyan;S. S. Gaisaryan;O. I. Samovarov

  • Affiliations:
  • Institute of System Programming, Russian Academy of Sciences, ul. Bol'shaya Kommunisticheskaya 25, Moscow, 109004 Russia arut@ispras.ru;Institute of System Programming, Russian Academy of Sciences, ul. Bol'shaya Kommunisticheskaya 25, Moscow, 109004 Russia ssg@ispras.ru;Institute of System Programming, Russian Academy of Sciences, ul. Bol'shaya Kommunisticheskaya 25, Moscow, 109004 Russia samov@ispras.ru

  • Venue:
  • Programming and Computing Software
  • Year:
  • 2002

Quantified Score

Hi-index 0.01

Visualization

Abstract

The problem of load balancing when executing parallel programs on computational systems with distributed memory is currently of great interest. The most general statement of this problem is that for one parallel loop: execution of a heterogeneous loop on a heterogeneous computational system. When stated in this way, the problem is NP-complete even in the case of two nodes, and no acceptable heuristics for solving it are found. Since the development of heuristics is a rather complicated task, we decided to examine the problem by elementary methods in order to refine (and, possibly, simplify) the original problem statement. The results of our studies are discussed in this paper. Estimates of efficiency of parallel loop execution as functions of the number of nodes of homogeneous and heterogeneous parallel computational systems are obtained. These estimates show that the use of heterogeneous parallel systems reduces the efficiency even in the case when their communication subsystems are scaleable (see the definition in Section 4). The use of local networks (heterogeneous parallel computational systems with nonscaleable communication subsystems) for parallel computations with heavy data exchange is not advantageous and is possible only for a small number of nodes (about five). An algorithm of optimal distribution of data between the nodes of a homogeneous or heterogeneous computational system is suggested. Results of numerical experiments substantiate the conclusions obtained.