Processing moldable tasks on the grid: Late job binding with lightweight user-level overlay

Authors:
J. T. Mościcki;M. Lamanna;M. Bubak;P. M. A. Sloot
Affiliations:
CERN, IT Department, CH-1211 Geneva, Switzerland and Faculty of Sciences, Computational Science, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands;CERN, IT Department, CH-1211 Geneva, Switzerland;Faculty of Sciences, Computational Science, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands and Institute of Computer Science, AGH, al. Mickiewicza 30, 30-059 Krakow, Po ...;Faculty of Sciences, Computational Science, University of Amsterdam, Kruislaan 403, 1098 SJ Amsterdam, The Netherlands
Venue:
Future Generation Computer Systems
Year:
2011

Abstract

Independent observations and everyday user experience indicate that the performance and reliability of large grid infrastructures may suffer from large and unpredictable variations. In this paper we study the impact of job queuing time on the processing of moldable tasks, which are commonly found in large-scale production grids. We use the mean value and variance of the makespan as quality-of-service indicators. We develop a general task processing model to provide a quantitative comparison between two models, early and late job binding, in a user-level overlay applied to the EGEE Grid infrastructure. We find that the late-binding model effectively defines a transformation of the distribution of the makespan according to the Central Limit Theorem. As demonstrated by Monte Carlo simulations using real job traces, this transformation substantially reduces the mean value and variance of the makespan. For certain classes of applications, task granularity may be adjusted such that a speedup of an order of magnitude or more is achieved. We use this result to propose a general strategy for managing access to resources and optimizing workload, based on the Ganga and DIANE user-level overlay tools. Key features of this approach include a late-binding scheduler, the ability to interface to a wide range of distributed systems, the ability to extend and customize the system to cover application-specific scheduling and processing patterns and, finally, ease of use and lightweight deployment in user space. We discuss the impact of this approach on some practical applications where efficient processing of many tasks is required to solve scientific problems.
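
The contrast between early and late binding described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's model or code. It assumes synthetic, lognormally distributed queuing times in place of the real EGEE job traces, identical task durations, and no worker failures. The function names and all parameter values are hypothetical, chosen only for illustration. With early binding, tasks are split among jobs before submission, so the makespan is set by the slowest job to start. With late binding, each worker agent pulls the next task once it is running.

```python
import heapq
import random
import statistics


def early_binding_makespan(queue_times, n_tasks, task_time):
    """Tasks are assigned to jobs before submission (static, even split)."""
    n_jobs = len(queue_times)
    base, extra = divmod(n_tasks, n_jobs)
    finish_times = []
    for j, queued in enumerate(queue_times):
        share = base + (1 if j < extra else 0)
        if share > 0:
            # A job finishes after its queuing delay plus all of its tasks.
            finish_times.append(queued + share * task_time)
    return max(finish_times)


def late_binding_makespan(queue_times, n_tasks, task_time):
    """Workers pull one task at a time from a shared pool once they start."""
    free_at = list(queue_times)  # time at which each worker is next idle
    heapq.heapify(free_at)
    makespan = 0.0
    for _ in range(n_tasks):
        start = heapq.heappop(free_at)  # earliest idle worker takes the task
        done = start + task_time
        makespan = max(makespan, done)
        heapq.heappush(free_at, done)
    return makespan


def simulate(n_runs=200, n_jobs=20, n_tasks=200, task_time=60.0, seed=1):
    """Sample makespans for both models over the same queuing times."""
    rng = random.Random(seed)
    early, late = [], []
    for _ in range(n_runs):
        # Assumption: heavy-tailed synthetic queuing times, in seconds.
        queue_times = [rng.lognormvariate(6.0, 1.5) for _ in range(n_jobs)]
        early.append(early_binding_makespan(queue_times, n_tasks, task_time))
        late.append(late_binding_makespan(queue_times, n_tasks, task_time))
    return early, late


if __name__ == "__main__":
    early, late = simulate()
    for name, sample in (("early", early), ("late", late)):
        print(name, statistics.mean(sample), statistics.stdev(sample))
```

Under these assumptions the late-binding makespan never exceeds the early-binding one for the same queuing times, because a late-starting worker simply processes fewer tasks instead of delaying a fixed share of the workload. How large the gap is depends entirely on the assumed queuing-time distribution and task granularity.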