Idle computation cycles of a shared network of workstations are increasingly being used to run batch parallel programs. In one common paradigm, a batch program task running on an idle workstation is preempted when the owner reclaims the workstation. This owner interference has a considerable impact on the execution time of a batch program, especially for large parallel programs. Replication of batch program tasks has been used to reduce the impact of owner interference. We show analytically that replication can significantly improve parallel program speedup. Perhaps surprisingly, replication can also improve efficiency for certain workloads. We present an analysis that quantifies the improvement in speedup and efficiency. Furthermore, we provide an analysis to help determine whether extra available workstations should be used to increase job parallelism or to replicate tasks.
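The trade-off between parallelism and replication can be illustrated with a deliberately simple toy model. It is not the analysis from the paper; every name and parameter below is a hypothetical chosen for illustration. The sketch assumes each task copy takes `task_time` on an undisturbed workstation, is delayed by a fixed `delay` with probability `q` when the owner reclaims the machine, and that copies are disturbed independently. A task finishes when its fastest replica finishes, and the job finishes when its slowest task finishes.

```python
# Toy model (an assumption for illustration, not the paper's analysis):
# a task copy is delayed by `delay` with probability `q`, independently
# of every other copy on every other workstation.

def expected_job_time(n_tasks, replicas, task_time, delay, q):
    """Expected completion time of a job with n_tasks parallel tasks,
    each replicated `replicas` times."""
    # A task is delayed only if all of its replicas are delayed.
    p_task_delayed = q ** replicas
    # The job is delayed if at least one of its tasks is delayed.
    p_job_delayed = 1.0 - (1.0 - p_task_delayed) ** n_tasks
    return task_time + delay * p_job_delayed


def speedup(n_tasks, replicas, task_time, delay, q):
    """Speedup relative to running all tasks serially on one
    dedicated (never interrupted) workstation."""
    serial_time = n_tasks * task_time
    return serial_time / expected_job_time(n_tasks, replicas, task_time, delay, q)


def efficiency(n_tasks, replicas, task_time, delay, q):
    """Speedup divided by the number of workstations used."""
    workstations = n_tasks * replicas
    return speedup(n_tasks, replicas, task_time, delay, q) / workstations


if __name__ == "__main__":
    # Compare spending 8 workstations on parallelism versus replication.
    for n_tasks, replicas in [(8, 1), (4, 2), (2, 4)]:
        s = speedup(n_tasks, replicas, task_time=1.0, delay=20.0, q=0.25)
        e = efficiency(n_tasks, replicas, task_time=1.0, delay=20.0, q=0.25)
        print(f"tasks={n_tasks} replicas={replicas} speedup={s:.3f} efficiency={e:.3f}")
```

In this toy model, replication pays off most when the delay caused by an owner reclaim is large relative to the task time, which is one way to read the claim that efficiency can improve for certain workloads. The fixed per-task time also ignores that higher parallelism usually shortens each task, so the numbers are only indicative.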