Static worksharing strategies for heterogeneous computers with unrecoverable interruptions

  • Authors:
  • Anne Benoit;Yves Robert;Arnold Rosenberg;Frédéric Vivien

  • Affiliations:
  • Ecole Normale Supérieure de Lyon, 46 allée d'Italie, 69364 Lyon Cedex 07, France;Ecole Normale Supérieure de Lyon, 46 allée d'Italie, 69364 Lyon Cedex 07, France;Colorado State University, Fort Collins, USA;Ecole Normale Supérieure de Lyon, 46 allée d'Italie, 69364 Lyon Cedex 07, France

  • Venue:
  • Parallel Computing
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

One has a large computational workload that is ''divisible'' (its constituent tasks' granularity can be adjusted arbitrarily) and one has access to p remote computers that can assist in computing the workload. How can one best utilize the computers? Two features complicate this question. First, the remote computers may differ from one another in speed. Second, each remote computer is subject to interruptions of known likelihood that kill all work in progress on it. One wishes to orchestrate sharing the workload with the remote computers in a way that maximizes the expected amount of work completed. We deal with three versions of this problem. The simplest version ignores communication costs but allows computers to differ in speed (a heterogeneous set of computers). The other two versions account for communication costs, first with identical remote computers (a homogeneous set of computers), and then with computers that may differ in speed. We provide exact expressions for the optimal work expectation for all three versions of the problem - via explicit closed-form expressions for the first two versions, and via a recurrence that computes this optimal value for the last, most general version.