Pro-active failure handling mechanisms for scheduling in grid computing environments

  • Authors:
  • B. T. Benjamin Khoo; Bharadwaj Veeravalli

  • Affiliations:
  • National University of Singapore, Department of Electrical and Computer Engineering, Singapore; National University of Singapore, Department of Electrical and Computer Engineering, Singapore

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2010

Abstract

In this paper, we design pro-active failure handling strategies for grid environments. These strategies estimate the availability of resources in the Grid and preemptively calculate its expected long-term capacity. Using these strategies, we create modified versions of the backfill and replication algorithms that incorporate all three pro-active strategies, in order to assess how effectively each prevents job failures during execution. We also extend our earlier work on a coordinate-based allocation strategy; the extended algorithm likewise shows continual improvement when operating under the same execution environment. In our experiments, we compare the enhanced algorithms with their original forms and show that pro-active failure handling can, in some cases, avoid all job failures during execution. We also show that, of the algorithms considered, NSA provides the best balance between enhanced throughput and job failures during execution.