Lightweight fault-tolerance mechanism for distributed mobile agent-based monitoring

  • Authors: Jinho Ahn

  • Affiliations: Dept. of Computer Science, Kyonggi University, Suwon, Gyeonggi-do, Republic of Korea

  • Venue: CCNC'09 Proceedings of the 6th IEEE Conference on Consumer Communications and Networking Conference
  • Year: 2009

Abstract

Thanks to the asynchronous and dynamic nature of mobile agents, a number of mobile agent-based monitoring mechanisms have been developed to monitor large-scale, dynamic distributed networked systems adaptively and efficiently. Some of these mechanisms attempt to adapt to dynamic changes in aspects such as network traffic patterns, resource addition and deletion, and network topology. However, failures of some domain managers can critically impair the correct, real-time and efficient monitoring that a large-scale mobile agent-based distributed monitoring system must provide. In this paper, we present a novel fault-tolerance mechanism with the following features, which suit large-scale and dynamic hierarchical mobile agent-based monitoring organizations. First, it supports fast failure detection with low failure-free overhead: each domain manager transmits heartbeat messages to its immediate higher-level manager. Second, it minimizes the number of non-faulty monitoring managers affected by failures of domain managers. Third, it allows consistent failure detection to continue across agent creation, migration and termination, and it can execute consistent takeover actions even when domain managers fail concurrently.
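
The heartbeat-based detection described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the `Manager` class, the `timeout` value, and the method names are hypothetical, and time is passed in explicitly to keep the example deterministic. It shows only the general idea that each domain manager reports liveness to its immediate higher-level manager, which suspects a child once no heartbeat has arrived within the timeout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Manager:
    """A domain manager in a monitoring hierarchy (hypothetical model)."""
    name: str
    parent: Optional["Manager"] = None
    timeout: float = 3.0  # assumed: seconds of silence before suspecting a child
    last_heartbeat: Dict[str, float] = field(default_factory=dict)

    def send_heartbeat(self, now: float) -> None:
        """Report liveness to the immediate higher-level manager only."""
        if self.parent is not None:
            self.parent.last_heartbeat[self.name] = now

    def suspected_children(self, now: float) -> List[str]:
        """Return children whose last heartbeat is older than the timeout."""
        return sorted(
            child for child, seen in self.last_heartbeat.items()
            if now - seen > self.timeout
        )


# Two domain managers report to one higher-level manager.
root = Manager("root")
a = Manager("domain-a", parent=root)
b = Manager("domain-b", parent=root)

# Both managers send heartbeats at t = 0, 1, 2.
for t in (0.0, 1.0, 2.0):
    a.send_heartbeat(t)
    b.send_heartbeat(t)

# domain-b fails after t = 2; only domain-a keeps reporting.
for t in (3.0, 4.0, 5.0, 6.0):
    a.send_heartbeat(t)

print(root.suspected_children(now=6.0))  # ['domain-b']
```

Because each manager sends heartbeats only to its immediate parent, the failure-free message cost in this sketch grows with the number of parent-child links rather than with the number of manager pairs. Takeover and the handling of agent creation, migration and termination are outside what this sketch models.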