Traditional job schedulers for grid or cluster systems assign incoming jobs to compute nodes in such a way that some evaluative condition is met. Such systems generally consider the availability of compute cycles, queue lengths, and expected job execution times, but they typically do not account directly for data staging and thus miss significant opportunities for optimisation. Intuitively, a tighter integration of job scheduling and automated data replication can yield significant advantages through optimised, faster access to data and decreased overall execution time. In this paper we treat data placement as a first-class citizen in scheduling and use an optimisation heuristic to generate schedules. We make two contributions. First, we identify the need to co-schedule job dispatching and data replication assignments, and posit that scheduling both simultaneously is critical for achieving good makespans. Second, we show that using a genetic search algorithm to solve the optimal allocation problem has the potential to achieve significant speed-ups over traditional allocation mechanisms. Through simulation, we show that our algorithm produces schedules whose makespan is, on average, approximately 20-45% faster than that of greedy schedulers.
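To make the co-scheduling idea concrete, here is a minimal Python sketch of a genetic search over a chromosome that encodes job-to-node assignments and file replica placements together. It is not the paper's algorithm. The cost model (one replica per file, a fixed staging penalty when a job runs away from its file), the toy workload, and all names (`makespan`, `greedy_schedule`, `genetic_search`) and parameter values are assumptions made for illustration.

```python
import random

# Hypothetical toy instance; every number here is illustrative, not from the paper.
NODES = 3
JOB_RUNTIME = [4, 3, 6, 2, 5, 3]   # compute time of each job
JOB_FILE = [0, 1, 0, 2, 1, 2]      # the single input file each job reads
FILE_TRANSFER = [5, 2, 4]          # staging time when a job's file is not local


def makespan(chromosome):
    """Chromosome = one node gene per job, followed by one node gene per file replica."""
    n_jobs = len(JOB_RUNTIME)
    job_node = chromosome[:n_jobs]
    replica_node = chromosome[n_jobs:]
    load = [0] * NODES
    for job, node in enumerate(job_node):
        f = JOB_FILE[job]
        # Assumed cost model: staging is paid only if the replica is on another node.
        staging = 0 if replica_node[f] == node else FILE_TRANSFER[f]
        load[node] += JOB_RUNTIME[job] + staging
    return max(load)


def greedy_schedule():
    """Baseline: fixed replica placement, each job goes to the least-loaded node."""
    replica_node = [f % NODES for f in range(len(FILE_TRANSFER))]
    load = [0] * NODES
    job_node = []
    for runtime in JOB_RUNTIME:
        node = load.index(min(load))
        job_node.append(node)
        load[node] += runtime      # data staging is deliberately ignored here
    return job_node + replica_node


def genetic_search(generations=200, pop_size=30, mutation_rate=0.1, seed=0):
    """Evolve job and replica genes jointly, keeping the best half each generation."""
    rng = random.Random(seed)
    length = len(JOB_RUNTIME) + len(FILE_TRANSFER)
    population = [[rng.randrange(NODES) for _ in range(length)]
                  for _ in range(pop_size - 1)]
    population.append(greedy_schedule())   # seed with the baseline
    for _ in range(generations):
        population.sort(key=makespan)
        survivors = population[: pop_size // 2]   # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, length)        # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: occasionally reassign a gene to a random node.
            child = [rng.randrange(NODES) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = survivors + children
    return min(population, key=makespan)


if __name__ == "__main__":
    best = genetic_search()
    print("greedy makespan: ", makespan(greedy_schedule()))
    print("genetic makespan:", makespan(best))
```

Because the greedy schedule is seeded into the initial population and the best individuals always survive, the search in this sketch can never return a schedule worse than the baseline under the assumed cost model. How much better it does depends entirely on the toy numbers chosen, so it says nothing about the 20-45% figure reported from the paper's simulations.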