A Basic-Cycle Calculation Technique for Efficient Dynamic Data Redistribution

  • Authors: Yeh-Ching Chung; Ching-Hsien Hsu; Sheng-Wen Bai
  • Affiliations: Feng Chia Univ., Taichung, Taiwan, China (all three authors)
  • Venue: IEEE Transactions on Parallel and Distributed Systems
  • Year: 1998

Abstract

Array redistribution is usually required to improve algorithm performance in many parallel programs on distributed-memory multicomputers. Because it is performed at run time, there is a performance trade-off between the efficiency of the new data decomposition for a subsequent phase of an algorithm and the cost of redistributing data among processors. In this paper, we present a basic-cycle calculation technique for efficiently performing BLOCK-CYCLIC(s) to BLOCK-CYCLIC(t) redistribution. The main idea of the technique is first to develop closed forms for computing the source and destination processors of some specific array elements in a basic-cycle, which is defined as lcm(s, t)/gcd(s, t). These closed forms are then used to determine the communication sets of a basic-cycle efficiently. From the source and destination processor sets and data sets of a basic-cycle, a BLOCK-CYCLIC(s) to BLOCK-CYCLIC(t) redistribution can be performed efficiently. To evaluate the performance of the technique, we implemented it on an IBM SP2 parallel machine, along with the PITFALLS method and the multiphase method. Cost models for these three methods are also presented. The experimental results show that the basic-cycle calculation technique outperforms the PITFALLS method and the multiphase method for most test samples.
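
To make the terms in the abstract concrete, the following is a minimal Python sketch of what a BLOCK-CYCLIC(s) to BLOCK-CYCLIC(t) redistribution has to compute. It is not the paper's closed-form method: the communication sets are found by brute-force enumeration, and the function names (`owner`, `basic_cycle`, `communication_sets`), the 0-based element indexing, and the ownership rule `(i // block) % nprocs` (the conventional block-cyclic mapping) are assumptions made for illustration. Only the expression lcm(s, t)/gcd(s, t) is taken from the abstract.

```python
from collections import defaultdict
from math import gcd


def owner(i, block, nprocs):
    """Processor owning element i under BLOCK-CYCLIC(block) on nprocs
    processors. Assumes 0-based indices and the conventional mapping."""
    return (i // block) % nprocs


def basic_cycle(s, t):
    """Basic-cycle size as stated in the abstract: lcm(s, t) / gcd(s, t)."""
    g = gcd(s, t)
    lcm = s * t // g
    return lcm // g


def communication_sets(n, s, t, nprocs):
    """Brute-force sketch (not the paper's closed forms): group the indices
    0..n-1 by (source processor, destination processor) pair."""
    sets = defaultdict(list)
    for i in range(n):
        src = owner(i, s, nprocs)   # owner before redistribution
        dst = owner(i, t, nprocs)   # owner after redistribution
        sets[(src, dst)].append(i)
    return dict(sets)


if __name__ == "__main__":
    # Hypothetical example: 12 elements, s = 2, t = 3, 2 processors.
    print("basic cycle:", basic_cycle(2, 3))
    for (src, dst), idx in sorted(communication_sets(12, 2, 3, 2).items()):
        print(f"P{src} -> P{dst}: {idx}")
```

Enumerating every element costs time proportional to the array size, which is the overhead a closed-form approach such as the one described in the abstract is meant to avoid.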