A dynamic rescheduling algorithm for resource management in large scale dependable distributed systems

  • Authors:
  • Alexandra Olteanu;Florin Pop;Ciprian Dobre;Valentin Cristea

  • Venue:
  • Computers & Mathematics with Applications
  • Year:
  • 2012

Quantified Score

Hi-index 0.09

Abstract

Scheduling is a key component in providing performance guarantees for distributed applications running in large scale heterogeneous environments. In such systems the scheduler is also responsible for implementing resilience mechanisms that cope with possible faults, and resilience is best approached through dedicated rescheduling mechanisms. Because large scale distributed systems behave dynamically, the performance of rescheduling is very important. This paper proposes a generic rescheduling algorithm that can use a wide variety of scheduling heuristics, selected by users in advance depending on the system's structure. The rescheduling component is designed as a middleware service that aims to increase the dependability of large scale distributed systems. The system was evaluated in a real-world implementation for a Grid system. The proposed approach supports fault tolerance and offers an improved mechanism for resource management. The evaluation of the proposed rescheduling algorithm was performed using modeling and simulation. We present experimental results confirming the performance and capabilities of the proposed rescheduling algorithm.
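
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it describes: a rescheduling step that reacts to a resource fault and delegates placement to a user-selected heuristic. Every name here (`Schedule`, `Heuristic`, `least_loaded`, `first_available`, `reschedule`) is hypothetical, and tasks are reduced to a single numeric cost for illustration; none of this is taken from the paper.

```python
from typing import Callable, Dict, List

# Hypothetical model: a schedule maps a resource name to the costs of its tasks.
Schedule = Dict[str, List[float]]
# A heuristic picks a target resource for one task, given the current loads.
Heuristic = Callable[[float, Dict[str, float]], str]


def least_loaded(task_cost: float, loads: Dict[str, float]) -> str:
    """Pick the resource with the smallest load (ties broken by name)."""
    return min(sorted(loads), key=lambda r: loads[r])


def first_available(task_cost: float, loads: Dict[str, float]) -> str:
    """Pick the first resource in name order, ignoring load."""
    return sorted(loads)[0]


def reschedule(schedule: Schedule, failed: str, pick: Heuristic) -> Schedule:
    """Return a new schedule with the failed resource's tasks reassigned.

    The input schedule is not modified. The heuristic `pick` is chosen by
    the caller in advance, which is the pluggable part of the sketch.
    """
    survivors = {r: list(ts) for r, ts in schedule.items() if r != failed}
    if not survivors:
        raise RuntimeError("no resources left to reschedule onto")
    loads = {r: sum(ts) for r, ts in survivors.items()}
    # Place the largest orphaned tasks first; an illustrative choice only.
    for cost in sorted(schedule.get(failed, []), reverse=True):
        target = pick(cost, loads)
        survivors[target].append(cost)
        loads[target] += cost
    return survivors


if __name__ == "__main__":
    before = {"a": [4.0, 2.0], "b": [3.0], "c": [1.0]}
    print(reschedule(before, failed="a", pick=least_loaded))
```

Passing the heuristic as a plain function keeps the rescheduling loop independent of any one placement policy, so swapping `least_loaded` for `first_available` changes where the orphaned tasks land without touching `reschedule`.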