Journal of Parallel and Distributed Computing
One has a large computational workload that is "divisible" (the granularity of its constituent tasks can be adjusted arbitrarily), and one has access to p remote computers that can assist in computing the workload. How can one best utilize the computers? Two features complicate this question. First, the remote computers may differ from one another in speed. Second, each remote computer is subject to interruptions of known likelihood that kill all work in progress on it. One wishes to orchestrate sharing the workload with the remote computers in a way that maximizes the expected amount of work completed. We deal with three versions of this problem. The simplest version ignores communication costs but allows computers to differ in speed (a heterogeneous set of computers). The other two versions account for communication costs, first with identical remote computers (a homogeneous set of computers), and then with computers that may differ in speed. We provide exact expressions for the optimal work expectation for all three versions of the problem: explicit closed-form expressions for the first two versions, and a recurrence that computes this optimal value for the last, most general version.
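To make the objective concrete, the following is a minimal sketch of what "expected amount of work completed" can mean in the simplest version (no communication costs, computers of differing speed). It rests on assumptions that the abstract does not state: each computer receives a single chunk, a chunk of size w on a computer of speed s takes w / s time units, and the probability of an interruption by time t grows linearly as min(1, kappa * t). The function name, the parameter kappa, and the risk model are all hypothetical and chosen only for illustration; the sketch evaluates a given allocation and does not compute the optimal one.

```python
from typing import Sequence


def expected_work(chunks: Sequence[float],
                  speeds: Sequence[float],
                  kappa: float) -> float:
    """Expected work completed when computer i runs one chunk of size chunks[i].

    Assumed (hypothetical) model, not taken from the abstract:
      * computer i finishes its chunk after chunks[i] / speeds[i] time units;
      * it has been interrupted by time t with probability min(1, kappa * t);
      * an interruption kills the whole chunk, so a chunk counts only if the
        computer survives until the chunk is finished.
    """
    if len(chunks) != len(speeds):
        raise ValueError("chunks and speeds must have the same length")

    total = 0.0
    for work, speed in zip(chunks, speeds):
        finish_time = work / speed
        # Probability that no interruption occurs before the chunk completes.
        survival = max(0.0, 1.0 - kappa * finish_time)
        total += work * survival
    return total


if __name__ == "__main__":
    # Two computers, the second twice as fast as the first.
    print(expected_work([5.0, 10.0], [1.0, 2.0], kappa=0.1))
```

Under this assumed model a larger chunk is not always better: the chunk's contribution work * (1 - kappa * work / speed) grows at first and then shrinks as the interruption risk dominates, which is the tension that an optimal allocation has to balance.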