Distributed cooperative applications are increasingly designed as sets of autonomous entities, called agents, that interact and coordinate with one another; such a set is called a multi-agent system. These applications are often highly dynamic: agents can join or leave, and they can change roles, strategies, and so on. This dynamism poses new challenges for traditional approaches to fault tolerance. In this paper, we focus on crash failures and on the usual preventive approach, replication. Because the criticality of agents may evolve during the course of computation and problem solving, a static design is not appropriate. We therefore need to identify the most critical agents dynamically and automatically, and to adapt their replication strategies (e.g., active or passive replication, number of replicas) in order to maximize their reliability and availability. We describe a prototype architecture that supports adaptive replication. We also discuss and compare several control strategies for replication, including one that uses agent roles and another that uses inter-agent dependences as the information from which to infer and estimate the criticality of agents. Experiments and measurements are also reported.
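To make the dependence-based idea concrete, the following is a minimal illustrative sketch in Python and not the architecture or algorithm described in the paper. It assumes that each dependence is a weighted pair (dependent agent, provider agent), that an agent's criticality can be approximated by the total weight of the dependences other agents have on it, and that a fixed budget of extra replicas is shared out in proportion to criticality. The function names, the weights, and the agent names are all hypothetical.

```python
from typing import Dict, Tuple


def estimate_criticality(
    dependences: Dict[Tuple[str, str], float]
) -> Dict[str, float]:
    """Approximate each agent's criticality as the summed weight of the
    dependences that other agents have on it (an assumed heuristic)."""
    criticality: Dict[str, float] = {}
    for (dependent, provider), weight in dependences.items():
        # Make sure agents that nobody depends on still appear, with score 0.
        criticality.setdefault(dependent, 0.0)
        criticality[provider] = criticality.get(provider, 0.0) + weight
    return criticality


def allocate_replicas(criticality: Dict[str, float], budget: int) -> Dict[str, int]:
    """Share `budget` extra replicas in proportion to criticality,
    using the largest-remainder method to keep the counts integral."""
    total = sum(criticality.values())
    if total == 0 or budget <= 0:
        return {agent: 0 for agent in criticality}
    shares = {agent: budget * c / total for agent, c in criticality.items()}
    allocation = {agent: int(share) for agent, share in shares.items()}
    leftover = budget - sum(allocation.values())
    # Give the remaining replicas to the largest fractional remainders;
    # ties are broken by agent name so the result is deterministic.
    order = sorted(criticality, key=lambda a: (-(shares[a] - allocation[a]), a))
    for agent in order[:leftover]:
        allocation[agent] += 1
    return allocation


# Hypothetical dependence graph: (dependent, provider) -> weight.
dependences = {
    ("buyer", "broker"): 3.0,
    ("seller", "broker"): 3.0,
    ("broker", "logger"): 2.0,
}
criticality = estimate_criticality(dependences)
replicas = allocate_replicas(criticality, budget=4)
print(replicas)
```

In an adaptive setting, both functions would be re-run whenever the observed dependences change, so that replicas migrate toward whichever agents have become most critical. A role-based strategy could reuse `allocate_replicas` unchanged and replace only the criticality estimate.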