FATMAD is a fault-tolerant multi-agent development framework built on top of a mobile agent platform (Jade). It aims to serve two communities of users: Jade application developers and fault-tolerant protocol developers. Implementing fault tolerance at the application level incurs a significant development-time cost. FATMAD addresses this with a generic fault-tolerant protocol whose refinements yield a broad range of checkpoint and recovery protocols that support user applications, thus significantly reducing the development time of fault-tolerant agent applications. This paper introduces the design of FATMAD and explains how fault-tolerant protocol developers can extend it with additional checkpoint and recovery protocols. The key concepts and features are illustrated through the staggered checkpoint protocol.