Data-intensive Grid applications need access to large data sets, each of which may be replicated on several resources. Minimizing the overhead of transferring these data sets to the resources where the applications execute requires selecting appropriate computational and data resources. In this paper, we consider the problem of scheduling an application composed of a set of independent tasks, each of which requires multiple data sets that are each replicated on multiple resources. We break this problem into two parts: first, matching each task (or job) to one compute resource for executing the job and to one storage resource for each data set that the job requires; and second, assigning the set of tasks to the selected resources. We model the first part as an instance of the well-known Set Covering Problem (SCP) and apply a known heuristic for SCP to match jobs to resources. We tackle the second part by extending the existing MinMin and Sufferage algorithms to schedule the set of distributed data-intensive tasks. Through simulation, we compare the SCP-based matching heuristic with other matching heuristics, each used in conjunction with the task scheduling algorithms, and present the results.
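The two-part decomposition can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper. The abstract does not say which SCP heuristic is applied, so the sketch substitutes the classic greedy set-cover rule as a stand-in, and it shows plain MinMin rather than the extended variant. The function names, the `replicas` map (storage resource to the data sets it holds), and the `exec_time` table (assumed combined transfer and compute cost per task and compute resource) are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def greedy_set_cover(required: Set[str], replicas: Dict[str, Set[str]]) -> List[str]:
    """Part one (stand-in): choose storage resources that cover a job's data sets.

    Greedy rule: repeatedly take the resource holding the most still-uncovered
    data sets. Ties are broken by resource name so the result is deterministic.
    """
    uncovered = set(required)
    chosen: List[str] = []
    while uncovered:
        best = max(sorted(replicas), key=lambda r: len(replicas[r] & uncovered))
        gained = replicas[best] & uncovered
        if not gained:
            raise ValueError(f"no replica holds: {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= gained
    return chosen


def min_min(exec_time: Dict[str, Dict[str, float]]) -> List[Tuple[str, str, float]]:
    """Part two (plain MinMin): schedule the task that can finish earliest, repeat.

    exec_time[task][resource] is an assumed total cost of running the task on
    that compute resource, including the transfer of its data sets.
    """
    # Time at which each compute resource next becomes free.
    ready = {r: 0.0 for costs in exec_time.values() for r in costs}
    pending = set(exec_time)
    schedule: List[Tuple[str, str, float]] = []
    while pending:
        # Smallest completion time over every (pending task, resource) pair.
        finish, task, res = min(
            (ready[r] + cost, t, r)
            for t in sorted(pending)
            for r, cost in exec_time[t].items()
        )
        ready[res] = finish
        pending.remove(task)
        schedule.append((task, res, finish))
    return schedule
```

For example, with `replicas = {"s1": {"d1", "d2"}, "s2": {"d2", "d3"}, "s3": {"d3"}}` and a job requiring `d1`, `d2`, and `d3`, the greedy rule selects `s1` and then `s2`. In a data-aware scheduler, the cost table passed to `min_min` would be recomputed from the storage resources selected for each job; the sketch takes it as given to keep the two parts separate.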