Improving Performance via Computational Replication on a Large-Scale Computational Grid

  • Authors:
  • Yaohang Li;Michael Mascagni

  • Venue:
  • CCGRID '03 Proceedings of the 3rd International Symposium on Cluster Computing and the Grid
  • Year:
  • 2003

Abstract

High performance computing on a large-scale computational grid is complicated by the heterogeneous computational capabilities of each node, node unavailability, and unreliable network connectivity. Replicating computation on multiple nodes can significantly improve performance by reducing task completion time on a grid's dynamic environment. We develop an analytical model to determine the number of task replicas to meet the performance goals in different computational grid configurations. Furthermore, taking advantage of the statistical nature of grid-based Monte Carlo applications, we extend the computational replication technique to an N-out-of-M scheduling strategy for grid-based Monte Carlo applications, which can potentially form a large category of grid-computing applications. In addition, we establish a corresponding model for the N-out-of-M scheduling mechanism. Simulations are used to validate the computational replication models. Our preliminary results show that the models we use are effective in predicting the required number of replicas to achieve short task completion time with a given high probability.
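
The abstract does not state the paper's analytical model, so the following is only a minimal sketch of the kind of calculation it describes, under a simplifying assumption that is ours and not the authors': every replica (or subtask) finishes by the deadline independently with the same probability `p`. Under that assumption, at least one of `r` replicas finishes with probability 1 - (1 - p)^r, and at least N of M subtasks finish with a binomial tail probability. The function names `replicas_needed`, `prob_at_least`, and `tasks_to_schedule`, and the parameters `p`, `target`, and `max_m`, are hypothetical names introduced for illustration.

```python
from math import comb


def replicas_needed(p: float, target: float, max_r: int = 10_000) -> int:
    """Smallest r such that at least one of r replicas finishes in time
    with probability >= target.

    Assumption (ours, not the paper's): replicas succeed independently,
    each with probability p.
    """
    if not (0.0 < p <= 1.0) or not (0.0 < target < 1.0):
        raise ValueError("need 0 < p <= 1 and 0 < target < 1")
    for r in range(1, max_r + 1):
        if 1.0 - (1.0 - p) ** r >= target:
            return r
    raise ValueError("target not reachable within max_r replicas")


def prob_at_least(n: int, m: int, p: float) -> float:
    """Probability that at least n of m independent subtasks finish in time
    (binomial tail)."""
    return sum(comb(m, k) * p**k * (1.0 - p) ** (m - k) for k in range(n, m + 1))


def tasks_to_schedule(n: int, p: float, target: float, max_m: int = 10_000) -> int:
    """Smallest M such that N-out-of-M scheduling yields at least n finished
    subtasks with probability >= target, under the same independence assumption."""
    if n < 1 or not (0.0 < p <= 1.0) or not (0.0 < target < 1.0):
        raise ValueError("need n >= 1, 0 < p <= 1 and 0 < target < 1")
    for m in range(n, max_m + 1):
        if prob_at_least(n, m, p) >= target:
            return m
    raise ValueError("target not reachable within max_m subtasks")


if __name__ == "__main__":
    # Replicas of one task, each finishing in time with probability 0.5.
    print(replicas_needed(p=0.5, target=0.99))
    # Monte Carlo job that needs 100 completed subtasks out of M scheduled.
    print(tasks_to_schedule(n=100, p=0.8, target=0.95))
```

A real grid would likely violate the equal-probability and independence assumptions, since node speeds and availability are heterogeneous; the sketch only illustrates how a replica count can be derived from a target completion probability.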