Existing fault-tolerant job scheduling algorithms for computational grids assume that all sites apply the same fault-tolerance strategy. They overlook the fact that each grid site is an autonomous domain and may therefore adopt its own strategy. In practice, a large-scale computational grid commonly runs several fault-tolerance strategies at the same time, and these strategies may have different hardware and software requirements. For instance, if a grid site employs job checkpointing, each computational node must be able to periodically transmit the transient state of the job execution to the server. If a job fails, it migrates to another computational node and resumes from the last stored checkpoint. In this paper we therefore propose a genetic algorithm for job scheduling that addresses the heterogeneity of fault-tolerance mechanisms in a computational grid. We assume that the system supports four kinds of fault-tolerance mechanisms: job retry, job migration without checkpointing, job migration with checkpointing, and job replication. Because each fault-tolerance mechanism has different requirements for gene encoding, we also propose a new chromosome encoding approach that integrates the four kinds of mechanisms in a single chromosome. The algorithm also takes the risky nature of the grid environment into account. The risk relationship between jobs and nodes is defined by the security demand of the job and the trust level of the node. Simulation results show that our algorithm achieves a shorter makespan and reduces the job failure rate more effectively than the Min-Min and Sufferage algorithms.
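The abstract does not specify the chromosome layout or the risk formula, so the following Python sketch is only one possible reading of the idea. Every name in it (`Strategy`, `Node`, `failure_probability`, `random_gene`, `random_chromosome`) is hypothetical. The exponential risk function, its rate parameter `lam`, and the choice of two replicas are assumptions made for illustration, not the authors' definitions. Each gene pairs a primary node with a list of extra nodes whose meaning depends on the fault-tolerance strategy of the primary node's site.

```python
import math
import random
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    RETRY = "job retry"
    MIGRATE = "migration without checkpointing"
    CHECKPOINT = "migration with checkpointing"
    REPLICATE = "job replication"


@dataclass(frozen=True)
class Node:
    strategy: Strategy   # strategy of the site that owns this node
    trust_level: float   # assumed to lie in [0, 1]


def failure_probability(security_demand, trust_level, lam=3.0):
    """Assumed risk model: no risk when the node's trust level covers the
    job's security demand, otherwise risk grows with the gap."""
    if security_demand <= trust_level:
        return 0.0
    return 1.0 - math.exp(-lam * (security_demand - trust_level))


def random_gene(nodes, rng):
    """Build one gene: (primary node index, tuple of extra node indices)."""
    primary = rng.randrange(len(nodes))
    others = [i for i in range(len(nodes)) if i != primary]
    strategy = nodes[primary].strategy
    if strategy is Strategy.RETRY:
        extra = []                                  # rerun on the same node
    elif strategy in (Strategy.MIGRATE, Strategy.CHECKPOINT):
        extra = [rng.choice(others)] if others else []   # one backup node
    else:
        extra = rng.sample(others, k=min(2, len(others)))  # replica nodes
    return primary, tuple(extra)


def random_chromosome(num_jobs, nodes, rng):
    """A chromosome holds one gene per job, in job order."""
    return [random_gene(nodes, rng) for _ in range(num_jobs)]


if __name__ == "__main__":
    rng = random.Random(42)
    nodes = [
        Node(Strategy.RETRY, 0.9),
        Node(Strategy.MIGRATE, 0.6),
        Node(Strategy.CHECKPOINT, 0.7),
        Node(Strategy.REPLICATE, 0.5),
    ]
    for job, (primary, extra) in enumerate(random_chromosome(5, nodes, rng)):
        risk = failure_probability(0.8, nodes[primary].trust_level)
        print(job, nodes[primary].strategy.value, extra, round(risk, 3))
```

A variable-length extra field is one simple way to let a single chromosome carry all four mechanisms. A real implementation would also need crossover and mutation operators that keep each gene consistent with the strategy of its primary node.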