Statistical Properties of Task Running Times in a Global-Scale Grid Environment

  • Authors: Menno Dobber; Rob van der Mei; Ger Koole

  • Affiliations: Vrije Universiteit, The Netherlands; CWI and Vrije Universiteit, The Netherlands; Vrije Universiteit, The Netherlands

  • Venue: CCGRID '06 Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid
  • Year: 2006

Abstract

Grid computing technology connects globally distributed processors into an immense source of computing power, making it possible to run applications in parallel that would take orders of magnitude more time on a single processor. Key characteristics of a global-scale grid are the strong burstiness in the load on the resources and in the network capacities, and the fact that processors may be added to or removed from the grid at any time. To cope with these characteristics, it is essential to develop techniques that make applications robust against the dynamics of the grid environment. For these techniques to be effective, the statistical properties of those dynamics must be understood. Today, however, the statistical properties of the dynamic behavior of real global-scale grid environments are not well understood. Our main focus is on highly CPU-intensive grid applications, which require huge amounts of processor power for running tasks. Motivated by this, we have performed extensive measurements in a real, global-scale grid environment to study the statistical properties of the running times of tasks on processors. Among other things, we observe (1) strong burstiness of the running times over different time scales, (2) strong heterogeneity of the running-time characteristics among the different hosts, (3) strong heterogeneity of the running-time characteristics for the same host over different time intervals, and (4) the occurrence of sudden level switches in the running times. These observations are used to develop effective techniques for the prediction of running times. These prediction techniques, in turn, can be used to develop effective control schemes for robust grid applications.
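
To make observations (1) and (4) concrete, the following minimal Python sketch shows one way such properties could be examined in a series of measured task running times. It is an illustration under stated assumptions, not the method used in the paper: the function names, the window size, the relative threshold, and the synthetic data are all hypothetical. Burstiness is summarized by the coefficient of variation after averaging over blocks of different lengths, and a level switch is flagged only when every measurement in a window deviates from the preceding window's mean, so that a single spike is not mistaken for a switch.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)


def aggregate(values, block):
    """Average consecutive non-overlapping blocks of `block` measurements.

    A trailing partial block is dropped.
    """
    return [statistics.fmean(values[i:i + block])
            for i in range(0, len(values) - block + 1, block)]


def level_switches(values, window=5, threshold=0.5):
    """Return indices at which a sustained level switch appears to start.

    Hypothetical rule (an assumption, not taken from the paper): index i is
    a switch if each of the next `window` values differs from the mean of
    the previous `window` values by more than `threshold` (relative).
    """
    switches = []
    i = window
    while i + window <= len(values):
        before = statistics.fmean(values[i - window:i])
        deviations = [abs(v - before) / before for v in values[i:i + window]]
        if min(deviations) > threshold:
            switches.append(i)
            i += window  # continue after the window that confirmed the switch
        else:
            i += 1
    return switches


# Synthetic running times in seconds (made up for illustration):
# a level of 10 s with one isolated spike, then a sustained switch to 20 s.
times = [10.0] * 10 + [30.0] + [10.0] * 9 + [20.0] * 20

for block in (1, 5, 10):
    print(block, round(coefficient_of_variation(aggregate(times, block)), 3))

print(level_switches(times))  # -> [20]
```

On real measurements, a coefficient of variation that stays high as the block length grows would be consistent with burstiness over several time scales, and the window and threshold would need tuning per host given the heterogeneity noted above.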