Thanks to the asynchronous and dynamic nature of mobile agents, a number of mobile agent-based monitoring mechanisms have been developed to monitor large-scale, dynamic distributed networked systems adaptively and efficiently. Some of these mechanisms attempt to adapt to dynamic changes such as shifts in network traffic patterns, the addition and deletion of resources, and changes in network topology. However, failures of domain managers critically impair the ability of a large-scale mobile agent-based distributed monitoring system to provide correct, real-time and efficient monitoring. In this paper, we present a novel fault-tolerance mechanism with the following features, which suit large-scale and dynamic hierarchical mobile agent-based monitoring organizations. First, it supports fast failure detection with low failure-free overhead by having each domain manager transmit heart-beat messages to its immediate higher-level manager. Second, it minimizes the number of non-faulty monitoring managers affected by failures of domain managers. Third, it allows consistent failure detection to continue across agent creation, migration and termination, and it can execute consistent takeover actions even when domain managers fail concurrently.
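The heart-beat idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the class name `HeartbeatMonitor`, its methods, the timeout value, and the identifiers `dm-a` and `dm-b` are all hypothetical, and takeover handling is omitted. It only shows how a higher-level manager might track heart-beats from its immediate children and stop watching agents that migrate away or terminate normally.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Hypothetical higher-level manager tracking its immediate children."""

    timeout: float  # seconds of silence after which a child is suspected
    last_seen: Dict[str, float] = field(default_factory=dict)

    def register(self, child_id: str, now: float) -> None:
        # A domain manager was created under, or migrated to, this manager.
        self.last_seen[child_id] = now

    def unregister(self, child_id: str) -> None:
        # Normal termination or migration away: stop watching this child
        # so that it is not later reported as failed.
        self.last_seen.pop(child_id, None)

    def heartbeat(self, child_id: str, now: float) -> None:
        # Record a heart-beat message; unknown senders are ignored.
        if child_id in self.last_seen:
            self.last_seen[child_id] = now

    def suspected(self, now: float) -> List[str]:
        # Children whose last heart-beat is older than the timeout.
        return sorted(
            child for child, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout=3.0)
    monitor.register("dm-a", now=0.0)
    monitor.register("dm-b", now=0.0)
    monitor.heartbeat("dm-a", now=2.0)
    print(monitor.suspected(now=4.0))
```

Timestamps are passed in explicitly so the sketch stays deterministic; a real deployment would read a clock and would have to choose the timeout against network delay, which the abstract does not specify.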