Improving speedup and response times by replicating parallel programs on a SNOW

  • Authors:
  • Gaurav D. Ghare; Scott T. Leutenegger

  • Affiliations:
  • Department of Computer Science, University of Denver, Denver, CO; Department of Computer Science, University of Denver, Denver, CO

  • Venue:
  • JSSPP'04 Proceedings of the 10th international conference on Job Scheduling Strategies for Parallel Processing
  • Year:
  • 2004

Abstract

Idle computation cycles of a shared network of workstations are increasingly being used to run batch parallel programs. For one common paradigm, the batch program task running on an idle workstation is preempted when the owner reclaims the workstation. This owner interference has a considerable impact on the execution time of a batch program, especially in the case of large parallel programs. Replication of batch program tasks has been used to reduce the impact of owner interference. We show analytically that replication can significantly improve parallel program speedup. Perhaps surprisingly, replication can also improve efficiency for certain workloads. We present analysis to quantify the amount of speedup and efficiency improvement. Furthermore, we provide analysis to help determine whether extra available workstations should be used for increasing job parallelism or for task replication.
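
The effect described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' analytical model; it is a toy simulation under assumptions that do not come from the paper: each workstation alternates between exponentially distributed owner-away and owner-present periods, a preempted task is suspended and later resumes where it left off, every workstation starts in an owner-away period, a task finishes when its fastest replica finishes, and the job finishes when its slowest task finishes. All function names, parameter values, and the speedup and efficiency definitions are hypothetical choices made for illustration.

```python
import random


def task_time(rng, demand, idle_mean, busy_mean):
    """Wall-clock time for one task copy on one workstation (toy model)."""
    elapsed, remaining = 0.0, demand
    while True:
        idle = rng.expovariate(1.0 / idle_mean)  # owner is away, task runs
        if idle >= remaining:
            return elapsed + remaining
        elapsed += idle
        remaining -= idle
        # Owner reclaims the workstation; the task waits (assumed: suspend/resume).
        elapsed += rng.expovariate(1.0 / busy_mean)


def job_time(rng, tasks, replicas, demand, idle_mean, busy_mean):
    """Job ends when every task has at least one finished replica."""
    return max(
        min(task_time(rng, demand, idle_mean, busy_mean) for _ in range(replicas))
        for _ in range(tasks)
    )


def mean_job_time(tasks, replicas, demand=10.0, idle_mean=10.0,
                  busy_mean=5.0, trials=2000, seed=0):
    """Average job completion time over independent simulated runs."""
    rng = random.Random(seed)
    total = sum(
        job_time(rng, tasks, replicas, demand, idle_mean, busy_mean)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    tasks, demand = 16, 10.0
    for replicas in (1, 2, 3):
        t = mean_job_time(tasks, replicas, demand=demand)
        # Assumed definitions: speedup relative to running all tasks back to
        # back on one dedicated machine; efficiency per workstation used.
        speedup = tasks * demand / t
        efficiency = speedup / (tasks * replicas)
        print(f"replicas={replicas} time={t:.2f} "
              f"speedup={speedup:.2f} efficiency={efficiency:.3f}")
```

Running the script prints the mean completion time, speedup, and efficiency for one, two, and three replicas per task, so the trade-off between extra workstations and reduced owner interference can be inspected for the chosen parameters. Whether efficiency rises or falls with replication depends on those parameters in this toy model.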