A task replication and fair resource management scheme for fault tolerant grids

  • Authors:
  • Antonios Litke;Konstantinos Tserpes;Konstantinos Dolkas;Theodora Varvarigou

  • Affiliations:
  • Department of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece;Department of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece;Department of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece;Department of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece

  • Venue:
  • EGC'05 Proceedings of the 2005 European conference on Advances in Grid Computing
  • Year:
  • 2005

Abstract

In this paper we study a fault-tolerant model for Grid environments based on task replication. The basic idea is to produce multiple replicas of a given task and submit them to the Grid, assuming that the failure probability of each replica is known a priori. We introduce a scheme for calculating the number of replicas when the replicas have diverse failure probabilities, and we propose an efficient resource management scheme, based on a fair-share technique, which handles the task replicas so as to maintain fault tolerance in the Grid in a fair way. Our study concludes with simulation results that validate the proposed scheme.
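
The replication idea in the abstract can be illustrated with a minimal sketch. This is not the scheme from the paper, whose details are not given here. It assumes that replica failures are independent, so that a task fails only if all of its replicas fail, and that the goal is to keep that joint failure probability at or below a chosen threshold. The function name `replicas_needed` and the parameter `max_task_failure` are hypothetical names introduced for illustration.

```python
def replicas_needed(failure_probs, max_task_failure):
    """Return the smallest k such that the first k replicas all fail
    with probability <= max_task_failure, or None if the listed
    replicas are not enough.

    Assumption (not from the paper): replica failures are independent,
    so the joint failure probability is the product of the individual
    failure probabilities.
    """
    if not 0.0 < max_task_failure < 1.0:
        raise ValueError("max_task_failure must lie strictly between 0 and 1")

    joint_failure = 1.0
    for k, p in enumerate(failure_probs, start=1):
        if not 0.0 <= p <= 1.0:
            raise ValueError("each failure probability must lie in [0, 1]")
        joint_failure *= p  # probability that replicas 1..k all fail
        if joint_failure <= max_task_failure:
            return k
    return None  # threshold not reachable with the given replicas


# Hypothetical failure probabilities for three candidate replicas.
print(replicas_needed([0.5, 0.5, 0.5], 0.2))  # 0.5 * 0.5 * 0.5 = 0.125 <= 0.2, prints 3
```

Under these assumptions the order of `failure_probs` matters: placing the most reliable replicas first reaches the threshold with fewer replicas, which is one reason a resource manager might rank resources before assigning replicas.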