Modeling job arrivals in a data-intensive grid

  • Authors:
  • Hui Li; Michael Muskulus; Lex Wolters

  • Affiliations:
  • Leiden Institute of Advanced Computer Science, Leiden University, Leiden, The Netherlands; Mathematical Institute, Leiden University, Leiden, The Netherlands; Leiden Institute of Advanced Computer Science, Leiden University, Leiden, The Netherlands

  • Venue:
  • JSSPP'06 Proceedings of the 12th international conference on Job scheduling strategies for parallel processing
  • Year:
  • 2006

Abstract

In this paper we present an initial analysis of job arrivals in a production data-intensive Grid and investigate several traffic models to characterize the interarrival time processes. Our analysis focuses on heavy-tail behavior and autocorrelation structure, and the modeling is carried out at three levels: Grid, Virtual Organization (VO), and region. A set of m-state Markov modulated Poisson processes (MMPPs) is investigated, while Poisson processes and hyperexponential renewal processes are evaluated for comparison. We apply the transportation distance metric from dynamical systems theory to further characterize the differences between the data trace and the simulated time series, and estimate errors by bootstrapping. The experimental results show that MMPPs with a certain number of states are successful to a certain extent in simulating the job traffic at the different levels, fitting both the interarrival time distribution and the autocorrelation function. However, MMPPs are not able to match the autocorrelations for certain VOs, in which strong deterministic semi-periodic patterns are observed. These patterns are further characterized using different representations. Future work is needed to model both the deterministic and the stochastic components in order to better capture the correlation structure in the series.
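
To make the contrast between the model classes concrete, the following is a minimal sketch, not the authors' implementation, of a 2-state MMPP simulator together with a sample autocorrelation estimate for the interarrival times. The function names, the arrival and switching rates, and the sample size are all illustrative assumptions; none of them are taken from the paper, and no fitting to a trace is attempted.

```python
import random


def simulate_mmpp2(rates, switch_rates, n_arrivals, seed=0):
    """Draw interarrival times from a 2-state MMPP (illustrative sketch).

    rates[s]        -- Poisson arrival rate while in hidden state s (assumed)
    switch_rates[s] -- rate of leaving hidden state s (assumed)
    """
    rng = random.Random(seed)
    state = 0
    since_last = 0.0  # time elapsed since the previous arrival
    gaps = []
    while len(gaps) < n_arrivals:
        # Arrival and state switch compete; the next event of either
        # kind occurs after an exponential time with the summed rate.
        total = rates[state] + switch_rates[state]
        since_last += rng.expovariate(total)
        if rng.random() < rates[state] / total:
            gaps.append(since_last)  # the event was an arrival
            since_last = 0.0
        else:
            state = 1 - state  # the event was a switch of hidden state
    return gaps


def autocorrelation(xs, lag):
    """Sample autocorrelation of the sequence xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var


# Hypothetical parameters: a bursty state (rate 10) and a quiet state (rate 0.5).
mmpp_gaps = simulate_mmpp2(rates=(10.0, 0.5), switch_rates=(0.05, 0.05),
                           n_arrivals=20000, seed=1)

# Plain Poisson process for comparison: i.i.d. exponential interarrival times.
poisson_rng = random.Random(2)
poisson_gaps = [poisson_rng.expovariate(1.0) for _ in range(20000)]

mmpp_acf1 = autocorrelation(mmpp_gaps, 1)
poisson_acf1 = autocorrelation(poisson_gaps, 1)
```

Under these assumed parameters the MMPP interarrival times should show clearly positive lag-1 autocorrelation, because short and long gaps cluster by hidden state, while the Poisson series should show essentially none. This is the kind of correlation structure the abstract says a renewal model cannot reproduce; the semi-periodic patterns reported for certain VOs are a deterministic effect that this stochastic sketch does not capture.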