Independent observations and everyday user experience indicate that the performance and reliability of large grid infrastructures may suffer from large and unpredictable variations. In this paper we study the impact of job queuing time on the processing of moldable tasks, which are commonly found in large-scale production grids. We use the mean and variance of the makespan as quality-of-service indicators. We develop a general task processing model to provide a quantitative comparison between two models: early and late job binding in a user-level overlay applied to the EGEE Grid infrastructure. We find that the late-binding model effectively defines a transformation of the makespan distribution according to the Central Limit Theorem. As demonstrated by Monte Carlo simulations using real job traces, this transformation substantially reduces the mean and variance of the makespan. For certain classes of applications, task granularity may be adjusted so that a speedup of an order of magnitude or more is achieved. We use this result to propose a general strategy for managing access to resources and optimizing workload, based on the Ganga and DIANE user-level overlay tools. Key features of this approach include a late-binding scheduler, the ability to interface to a wide range of distributed systems, the ability to extend and customize the system to cover application-specific scheduling and processing patterns and, finally, ease of use and lightweight deployment in user space. We discuss the impact of this approach on some practical applications where efficient processing of many tasks is required to solve scientific problems.
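The difference between early and late binding can be illustrated with a toy Monte Carlo experiment. The sketch below is not the paper's model and does not use real job traces: it assumes a synthetic log-normal queuing-time distribution, tasks of equal length, and hypothetical function names and parameter values chosen only for illustration. With early binding, tasks are split across jobs before submission, so the makespan is set by the job that waits longest in the queue. With late binding, each job is a worker agent that pulls the next task once it is running, so slow-starting workers simply process fewer tasks.

```python
import heapq
import random
import statistics


def sample_queue_times(n_jobs, rng):
    """Draw synthetic job queuing times in seconds.

    Assumption: a heavy-tailed log-normal stands in for real grid traces.
    """
    return [rng.lognormvariate(6.0, 1.5) for _ in range(n_jobs)]


def early_binding_makespan(queue_times, n_tasks, task_time):
    """Tasks are assigned to jobs before submission (static split)."""
    base, extra = divmod(n_tasks, len(queue_times))
    finish_times = []
    for i, queued in enumerate(queue_times):
        assigned = base + (1 if i < extra else 0)
        if assigned:
            finish_times.append(queued + assigned * task_time)
    return max(finish_times)


def late_binding_makespan(queue_times, n_tasks, task_time):
    """Worker agents pull the next task as soon as they are free."""
    free_at = list(queue_times)  # each worker becomes free when its job starts
    heapq.heapify(free_at)
    makespan = 0.0
    for _ in range(n_tasks):
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + task_time
        makespan = max(makespan, end)
        heapq.heappush(free_at, end)
    return makespan


def simulate(n_trials=500, n_jobs=20, n_tasks=200, task_time=300.0, seed=1):
    """Compare both models on the same sampled queuing times."""
    rng = random.Random(seed)
    early, late = [], []
    for _ in range(n_trials):
        queue_times = sample_queue_times(n_jobs, rng)
        early.append(early_binding_makespan(queue_times, n_tasks, task_time))
        late.append(late_binding_makespan(queue_times, n_tasks, task_time))
    return early, late


early, late = simulate()
for name, values in (("early binding", early), ("late binding", late)):
    print(f"{name}: mean={statistics.mean(values):.0f} s, "
          f"stdev={statistics.stdev(values):.0f} s")
```

Because both models are evaluated on the same sampled queuing times, the late-binding makespan in this toy setting can never exceed the early-binding one. How large the gap is depends entirely on the assumed distribution and task granularity, so the printed numbers should not be read as a reproduction of the paper's results.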