Computation scheduling and data replication algorithms for data Grids

Authors:
Kavitha Ranganathan;Ian Foster
Affiliations:
Department of Computer Science, The University of Chicago;Department of Computer Science, The University of Chicago and Mathematics and Computer Science Division, Argonne National Laboratory
Venue:
Grid resource management
Year:
2004

Citing 0
Cited 13

Planning spatial workflows to optimize grid performance
Proceedings of the 2006 ACM symposium on Applied computing

Enabling Grid technologies for Planck space mission
Future Generation Computer Systems - Special section: Information engineering and enterprise architecture in distributed computing environments

A method for job scheduling in Grid based on job execution status
Multiagent and Grid Systems

Load distribution of analytical query workloads for database cluster architectures
EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology

INFORM: integrated flow orchestration and meta-scheduling for managed grid systems
Proceedings of the 2007 ACM/IFIP/USENIX international conference on Middleware companion

QoS-Oriented Reputation-Aware Query Scheduling in Data Grids
Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing

A new paradigm: Data-aware scheduling in grid computing
Future Generation Computer Systems

Runtime Estimations, Reputation and Elections for Top Performing Distributed Query Scheduling
CCGRID '09 Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid

An opportunistic algorithm for scheduling workflows on grids
VECPAR'06 Proceedings of the 7th international conference on High performance computing for computational science

FIRE: A File Reunion Based Data Replication Strategy for Data Grids
CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing

DECO: data replication and execution CO-scheduling for utility grids
ICSOC'06 Proceedings of the 4th international conference on Service-Oriented Computing

Evolving toward the perfect schedule: co-scheduling job assignments and data replication in wide-area systems using a genetic algorithm
JSSPP'05 Proceedings of the 11th international conference on Job Scheduling Strategies for Parallel Processing

Preference-Based Matchmaking of Grid Resources with CP-Nets
Journal of Grid Computing

Abstract

Data Grids seek to harness geographically distributed resources for large-scale data-intensive problems such as those encountered in high energy physics, bioinformatics, and other disciplines. These problems typically involve numerous, loosely coupled jobs that both access and generate large data sets. Effective scheduling in such environments is challenging, because of a need to address a variety of metrics and constraints (e.g., resource utilization, response time, global and local allocation policies) while dealing with multiple, potentially independent sources of jobs and a large number of storage, compute, and network resources.

We describe a scheduling framework that addresses these problems. Within this framework, data movement operations may be either tightly bound to job scheduling decisions or performed by a decoupled, asynchronous process on the basis of observed data access patterns and load. We develop a family of job scheduling and data movement (replication) algorithms and use simulation studies to evaluate various combinations. Our results suggest that while it is necessary to consider the impact of replication on the scheduling strategy, it is not always necessary to couple data movement and computation scheduling. Instead, these two activities can be addressed separately, thus significantly simplifying the design and implementation of the overall Data Grid system.
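
The decoupled arrangement described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm family or their simulator. The `Site` class, the functions `schedule_job` and `replicate_popular`, the queue-length load proxy, and the popularity `threshold` are all hypothetical names and simplifications introduced here for illustration. The point is only that the job scheduler reads where data currently sits, while a separate replication step moves data based on the observed access log, and neither calls the other.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical grid site: locally stored files plus a simple load proxy."""
    name: str
    files: set = field(default_factory=set)
    queue: int = 0  # number of jobs assigned so far (assumed load measure)


def schedule_job(sites, needed_file):
    """Assign a job to the least-loaded site that already holds the file.

    If no site holds the file, fall back to the least-loaded site overall.
    The scheduler never triggers data movement itself.
    """
    holders = [s for s in sites if needed_file in s.files]
    target = min(holders or sites, key=lambda s: s.queue)
    target.queue += 1
    return target


def replicate_popular(sites, access_log, threshold):
    """Decoupled replication step driven only by observed accesses and load.

    Each file requested at least `threshold` times is copied to the
    least-loaded site that does not yet hold it.
    """
    moved = []
    for file_id, hits in Counter(access_log).items():
        if hits < threshold:
            continue
        candidates = [s for s in sites if file_id not in s.files]
        if candidates:
            dest = min(candidates, key=lambda s: s.queue)
            dest.files.add(file_id)
            moved.append((file_id, dest.name))
    return moved


if __name__ == "__main__":
    sites = [Site("A", {"f1"}), Site("B"), Site("C")]
    log = []
    for _ in range(3):
        schedule_job(sites, "f1")
        log.append("f1")
    print(replicate_popular(sites, log, threshold=3))
    print(schedule_job(sites, "f1").name)
```

In this toy run, the first three jobs all land on the only site holding `f1`; once the replication step has copied the file elsewhere, the next job is placed on the new, idle replica site. In a tightly coupled design the scheduler itself would decide on the transfer when placing each job.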