Modelling pilot-job applications on production grids

  • Authors:
  • Tristan Glatard;Sorina Camarasu-Pop

  • Affiliations:
  • University of Lyon, CNRS, INSERM, CREATIS, Villeurbanne, France;University of Lyon, CNRS, INSERM, CREATIS, Villeurbanne, France

  • Venue:
  • Euro-Par'09 Proceedings of the 2009 international conference on Parallel processing
  • Year:
  • 2009

Abstract

Pilot-job systems have emerged as a computation paradigm to cope with the heterogeneity of production grids, greatly improving fault ratios and latency. Tools like DIANE, WISDOM-II, ToPoS and Condor glideIns are now being widely adopted to conduct large-scale experiments on such platforms. However, a model of pilot-job applications is still lacking, making it difficult to determine submission parameters, such as the number of pilots to submit to achieve a given performance level. The variability of production conditions and the heterogeneity of the underlying middleware and infrastructure further complicate this issue. This paper presents a performance model for pilot-job applications running on production grids. Based on probabilistic modelling, we derive statistics about the number of available pilots over time and the makespan of the application given the number of submitted pilots. Results obtained on a radiotherapy application running on the EGEE production grid show that the model is accurate enough to correctly describe the behavior of the application, setting the basis for further optimization strategies.
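
To make the two quantities named in the abstract concrete (the number of available pilots over time, and the makespan for a given number of submitted pilots), the following is a minimal Monte Carlo sketch. It is not the model from the paper. Everything in it is an illustrative assumption: each pilot independently fails with probability `fault_ratio`, surviving pilots register after an exponentially distributed latency with mean `mean_latency`, all tasks have the same fixed `task_duration`, and every function name is hypothetical.

```python
import heapq
import math
import random


def sample_start_times(n_pilots, fault_ratio, mean_latency, rng):
    """Draw registration times for the pilots that survive submission.

    Assumption (not from the paper): a pilot is lost with probability
    fault_ratio, otherwise its latency is exponential with the given mean.
    """
    return sorted(
        rng.expovariate(1.0 / mean_latency)
        for _ in range(n_pilots)
        if rng.random() >= fault_ratio
    )


def available_pilots(start_times, t):
    """Number of pilots that have registered by time t in one sampled run."""
    return sum(1 for s in start_times if s <= t)


def expected_available(n_pilots, fault_ratio, mean_latency, t):
    """Closed-form mean of available_pilots under the assumptions above."""
    return n_pilots * (1.0 - fault_ratio) * (1.0 - math.exp(-t / mean_latency))


def makespan(start_times, n_tasks, task_duration):
    """Completion time of the last task when pilots pull tasks from a shared queue."""
    if n_tasks == 0:
        return 0.0
    if not start_times:
        return math.inf  # no pilot ever registered, so the tasks never run
    free_at = list(start_times)  # time at which each pilot is next idle
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(n_tasks):
        start = heapq.heappop(free_at)  # earliest idle pilot takes the next task
        done = start + task_duration
        finish = max(finish, done)
        heapq.heappush(free_at, done)
    return finish


def mean_makespan(n_pilots, n_tasks, task_duration, fault_ratio,
                  mean_latency, runs=1000, seed=0):
    """Monte Carlo estimate of the mean makespan over runs with at least one pilot."""
    rng = random.Random(seed)
    samples = [
        makespan(
            sample_start_times(n_pilots, fault_ratio, mean_latency, rng),
            n_tasks,
            task_duration,
        )
        for _ in range(runs)
    ]
    finite = [m for m in samples if math.isfinite(m)]
    return sum(finite) / len(finite) if finite else math.inf
```

Calling `mean_makespan` for several values of `n_pilots` gives a rough feel for how the number of submitted pilots trades off against completion time under these toy assumptions. A realistic study would replace the exponential latency and fixed task duration with distributions measured on the target grid.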