On Optimal Strategies for Cycle-Stealing in Networks of Workstations

Authors:
Sandeep N. Bhatt; Fan R. K. Chung; F. Thomson Leighton; Arnold L. Rosenberg
Venue:
IEEE Transactions on Computers
Year:
1997

Abstract

We study the parallel scheduling problem for a new modality of parallel computing: having one workstation "steal cycles" from another. We focus on a draconian mode of cycle-stealing, in which the owner of workstation B allows workstation A to take control of B's processor whenever it is idle, with the promise of relinquishing control immediately upon demand. The typically high communication overhead for supplying workstation B with work and receiving its results militates in favor of supplying B with large amounts of work at a time; the risk of losing work in progress when the owner of B reclaims the workstation militates in favor of supplying B with a sequence of small packets of work. The challenge is to balance these two pressures in a way that maximizes the amount of work accomplished.

We formulate two models of cycle-stealing. The first attempts to maximize the expected work accomplished during a single episode, when one knows the probability distribution of the return of B's owner. The second attempts to match the productivity of an omniscient cycle-stealer, when one knows how much work that stealer can accomplish. We derive optimal scheduling strategies for sample scenarios within each of these models.

Perhaps our most important discovery is the as-yet unexplained coincidence that two quite distinct scenarios lead to almost identical unique optimizing schedules. One scenario falls within our first model; it assumes that the probability of the return of B's owner is uniform across the lifespan of the episode; the optimizing schedule maximizes the expected amount of work accomplished during the episode. The other scenario falls within our second model; it assumes that B's owner will interrupt our cycle-stealing at most once during the lifespan of the opportunity; the optimizing schedule maximizes the amount of work that one is guaranteed to accomplish during the lifespan.
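
The tension between large and small work packets in the first model can be illustrated with a short sketch. It is not taken from the paper. It assumes a fixed communication overhead per packet, an owner-return time distributed uniformly over the episode's lifespan, and that a packet's work counts only if the packet finishes before the owner returns. The function name `expected_work` and its parameters `periods`, `overhead`, and `lifespan` are hypothetical names chosen for illustration.

```python
from typing import Sequence


def expected_work(periods: Sequence[float], overhead: float, lifespan: float) -> float:
    """Expected useful work of a schedule, under the assumptions stated above.

    periods  -- length of each work packet, in time units (hypothetical parameter)
    overhead -- communication cost charged to every packet (assumed constant)
    lifespan -- length of the episode; the owner's return time is assumed
                to be uniform on [0, lifespan]
    """
    elapsed = 0.0
    total = 0.0
    for length in periods:
        elapsed += length
        if elapsed > lifespan:
            break  # this packet cannot finish within the episode
        # Probability that the owner has not yet returned when the packet ends.
        survive = 1.0 - elapsed / lifespan
        # A packet no longer than the overhead contributes no useful work.
        total += max(length - overhead, 0.0) * survive
    return total


if __name__ == "__main__":
    # Compare a few equal-length schedules over the same episode.
    for count in (1, 2, 4, 10, 100):
        schedule = [100.0 / count] * count
        print(count, expected_work(schedule, overhead=1.0, lifespan=100.0))
```

Under these assumptions, one packet spanning the whole episode and one hundred packets no longer than the overhead both yield zero expected work, while intermediate packet sizes do better. That is the balance the abstract describes. The sketch only evaluates a given schedule; it does not compute the optimal one.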