A Hybrid Fault-Tolerant Scheme Based on Checkpointing in MASs
ICOIN '02 Revised Papers from the International Conference on Information Networking, Wireless Communications Technologies and Network Applications-Part II
This paper introduces a fault-tolerance scheme for worker agents in a multi-agent system (MAS), based on checkpointing and rollback recovery. To address the fault tolerance of worker agents, we consider an extended MAS model based on task delegation and observation, which can be independent of node failures. In the proposed MAS, facilitators preserve global consistency by maintaining a task plan in stable storage, taking a checkpoint whenever a task plan is completed or a response to a subtask within the plan is received. The scheme uses communication-induced checkpointing to overcome the blocked worker agent problem.
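
The checkpoint triggers described in the abstract can be illustrated with a minimal sketch. It is not the paper's protocol: the `Facilitator` class, its method names, the JSON file standing in for stable storage, and the `completed` flag are all assumptions made for illustration. In this sketch, plan completion coincides with the last subtask response, so a single checkpoint covers both triggers.

```python
import json
import os
import tempfile


class Facilitator:
    """Hypothetical facilitator that checkpoints its task plan (illustrative only)."""

    def __init__(self, checkpoint_path):
        self.checkpoint_path = checkpoint_path  # assumed stand-in for stable storage
        self.plan = {}            # subtask id -> result (None while pending)
        self.completed = False
        self.checkpoints_taken = 0

    def delegate(self, subtask_ids):
        # Record the subtasks handed to worker agents; no checkpoint is taken here.
        self.plan = {sid: None for sid in subtask_ids}
        self.completed = False

    def on_response(self, subtask_id, result):
        # Communication-induced trigger: a subtask response has arrived.
        self.plan[subtask_id] = result
        # The plan is complete once every subtask has a result.
        self.completed = all(r is not None for r in self.plan.values())
        self._checkpoint()

    def _checkpoint(self):
        # Write to a temporary file, then rename, so a crash mid-write
        # never leaves a partially written checkpoint behind.
        state = {"plan": self.plan, "completed": self.completed}
        directory = os.path.dirname(os.path.abspath(self.checkpoint_path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.checkpoint_path)
        self.checkpoints_taken += 1

    @classmethod
    def recover(cls, checkpoint_path):
        # Rollback recovery: rebuild the facilitator from the last checkpoint.
        fac = cls(checkpoint_path)
        with open(checkpoint_path) as f:
            state = json.load(f)
        fac.plan = state["plan"]
        fac.completed = state["completed"]
        return fac

    def pending(self):
        # Subtasks that would need to be re-delegated after a failure.
        return [sid for sid, r in self.plan.items() if r is None]
```

After a failure, a replacement facilitator would call `Facilitator.recover(path)` and re-delegate whatever `pending()` returns, which is one way a blocked worker agent's subtask could be reassigned under these assumptions.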