Scheduling is a key component for guaranteeing the performance of distributed applications running in large-scale heterogeneous environments. In such systems the scheduler is also responsible for implementing resilience mechanisms that cope with possible faults, and resilience is best approached through dedicated rescheduling mechanisms. Because large-scale distributed systems behave dynamically, the performance of rescheduling itself is very important. This paper proposes a generic rescheduling algorithm that can use a wide variety of scheduling heuristics, which users select in advance depending on the system's structure. The rescheduling component is designed as a middleware service that aims to increase the dependability of large-scale distributed systems. The proposed approach supports fault tolerance and offers an improved mechanism for resource management. The system was evaluated in a real-world implementation for a Grid system, and the rescheduling algorithm itself was evaluated using modeling and simulation. We present experimental results confirming the performance and capabilities of the proposed rescheduling algorithm.
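The idea of a rescheduling step that is generic over a user-selected heuristic can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes independent tasks (no dependency graph), a known set of failed resources, and two toy heuristics. All names here (`min_completion_time`, `round_robin`, `reschedule`, and the task and resource labels) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Set

# Hypothetical signature for a pluggable heuristic:
# (task -> cost, resource -> speed, resource -> busy-until time) -> task -> resource
Heuristic = Callable[[Dict[str, float], Dict[str, float], Dict[str, float]], Dict[str, str]]


def min_completion_time(tasks, speeds, ready):
    """Greedy heuristic: place each task on the resource that finishes it earliest."""
    ready = dict(ready)  # work on a copy so the caller's state is untouched
    mapping = {}
    for task, cost in tasks.items():
        best = min(speeds, key=lambda r: ready[r] + cost / speeds[r])
        ready[best] += cost / speeds[best]
        mapping[task] = best
    return mapping


def round_robin(tasks, speeds, ready):
    """Alternative heuristic: cycle through resources, ignoring load and speed."""
    names = sorted(speeds)
    return {task: names[i % len(names)] for i, task in enumerate(tasks)}


def reschedule(mapping: Dict[str, str], tasks: Dict[str, float],
               speeds: Dict[str, float], failed: Set[str],
               heuristic: Heuristic) -> Dict[str, str]:
    """Remap only the tasks that sat on failed resources; keep all other placements."""
    alive = {r: s for r, s in speeds.items() if r not in failed}
    if not alive:
        raise RuntimeError("no resources left to reschedule onto")
    kept = {t: r for t, r in mapping.items() if r not in failed}
    # Account for the work already committed to the surviving resources.
    ready = {r: 0.0 for r in alive}
    for t, r in kept.items():
        ready[r] += tasks[t] / alive[r]
    orphans = {t: c for t, c in tasks.items() if t not in kept}
    new_mapping = dict(kept)
    new_mapping.update(heuristic(orphans, alive, ready))
    return new_mapping


tasks = {"t1": 4.0, "t2": 2.0, "t3": 6.0}   # assumed task costs
speeds = {"a": 1.0, "b": 2.0}               # assumed resource speeds
initial = min_completion_time(tasks, speeds, {"a": 0.0, "b": 0.0})
after_fault = reschedule(initial, tasks, speeds, {"b"}, min_completion_time)
```

Because `reschedule` only depends on the `Heuristic` signature, swapping `min_completion_time` for `round_robin` (or any other function of the same shape) changes the placement policy without touching the fault-handling logic, which is the sense of "generic" assumed in this sketch.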