Processing moldable tasks on the grid: Late job binding with lightweight user-level overlay

  • Authors:
  • J. T. Mościcki;M. Lamanna;M. Bubak;P. M. A. Sloot

  • Affiliations:
  • CERN, IT Department, CH-1211 Geneva, Switzerland and Faculty of Sciences, Computational Science, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands;CERN, IT Department, CH-1211 Geneva, Switzerland;Faculty of Sciences, Computational Science, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands and Institute of Computer Science, AGH, al. Mickiewicza 30, 30-059 Krakow, Po ...;Faculty of Sciences, Computational Science, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands

  • Venue:
  • Future Generation Computer Systems
  • Year:
  • 2011

Abstract

Independent observations and everyday user experience indicate that the performance and reliability of large grid infrastructures may suffer from large and unpredictable variations. In this paper we study the impact of job queuing time on the processing of moldable tasks, which are commonly found in large-scale production grids. We use the mean and variance of the makespan as quality-of-service indicators. We develop a general task processing model to provide a quantitative comparison between two models: early and late job binding in a user-level overlay applied to the EGEE Grid infrastructure. We find that the late-binding model effectively defines a transformation of the makespan distribution according to the Central Limit Theorem. As demonstrated by Monte Carlo simulations using real job traces, this transformation makes it possible to substantially reduce the mean and variance of the makespan. For certain classes of applications, task granularity may be adjusted such that a speedup of an order of magnitude or more may be achieved. We use this result to propose a general strategy for managing access to resources and optimizing workload, based on the Ganga and DIANE user-level overlay tools. Key features of this approach include a late-binding scheduler, the ability to interface to a wide range of distributed systems, the ability to extend and customize the system to cover application-specific scheduling and processing patterns and, finally, ease of use and lightweight deployment in user space. We discuss the impact of this approach on some practical applications where efficient processing of many tasks is required to solve scientific problems.
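The comparison between early and late binding can be illustrated with a small Monte Carlo sketch. This is not the task processing model of the paper. It is a simplified toy under stated assumptions: all tasks have the same execution time, queuing times are drawn from a lognormal distribution (a stand-in for the real job traces, which are not reproduced here), and the function names and parameter values are hypothetical. In early binding, tasks are split evenly among worker jobs before submission. In late binding, each worker pulls the next task from a shared queue once it has started running.

```python
import heapq
import random
import statistics


def early_binding_makespan(queue_times, n_tasks, task_time):
    """Tasks are split evenly among jobs before the jobs are submitted."""
    base, extra = divmod(n_tasks, len(queue_times))
    finish_times = []
    for i, queued in enumerate(queue_times):
        share = base + (1 if i < extra else 0)
        if share:
            finish_times.append(queued + share * task_time)
    return max(finish_times)


def late_binding_makespan(queue_times, n_tasks, task_time):
    """Each task goes to whichever worker becomes free first."""
    free_at = list(queue_times)  # a worker is first free when its job starts
    heapq.heapify(free_at)
    makespan = 0.0
    for _ in range(n_tasks):
        start = heapq.heappop(free_at)
        end = start + task_time
        makespan = max(makespan, end)
        heapq.heappush(free_at, end)
    return makespan


def simulate(n_trials=2000, n_tasks=1000, n_workers=50,
             task_time=60.0, seed=1):
    """Return mean and variance of the makespan for both binding models."""
    rng = random.Random(seed)
    early, late = [], []
    for _ in range(n_trials):
        # Assumption: lognormal queuing times in seconds, not real traces.
        queue_times = [rng.lognormvariate(6.0, 1.5) for _ in range(n_workers)]
        early.append(early_binding_makespan(queue_times, n_tasks, task_time))
        late.append(late_binding_makespan(queue_times, n_tasks, task_time))
    return {
        "early_mean": statistics.mean(early),
        "early_var": statistics.pvariance(early),
        "late_mean": statistics.mean(late),
        "late_var": statistics.pvariance(late),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.1f}")
```

In this toy model the early-binding makespan is dominated by the slowest job to leave the queue, whereas under late binding a slow job simply processes fewer tasks, so the tail of the queuing-time distribution has less influence on the total. The size of the effect depends entirely on the assumed distribution and parameters.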