Towards fault-tolerant massively multiagent systems

Authors:
Zahia Guessoum;Jean-Pierre Briot;Nora Faci
Affiliations:
LIP6, Université Pierre et Marie Curie (Paris 6), Paris, France;LIP6, Université Pierre et Marie Curie (Paris 6), Paris, France;MODECO-CReSTIC – IUT de Reims, Reims Cedex 2, France
Venue:
MMAS'04 Proceedings of the First international conference on Massively Multi-Agent Systems
Year:
2004

Abstract

To construct and deploy massively multiagent systems, we must address one of the fundamental issues of distributed systems: the possibility of partial failures. In this paper, we discuss these issues and propose an approach to fault tolerance for massively multiagent systems. The starting idea is to apply replication strategies to agents. Because the criticality of an agent may evolve during the course of computation and problem solving, and because resources are bounded, the number of replicas of each agent must be adapted dynamically and automatically in order to maximize reliability and availability. We describe our approach and the related mechanisms for evaluating the criticality of a given agent and for parameterizing its replication (e.g., the number of replicas). We also report on experiments conducted with our prototype architecture, named DarX.
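
As an illustration of adapting replica counts to criticality under bounded resources, the following is a minimal sketch and not the mechanism implemented in DarX. It assumes that each agent already has a non-negative numeric criticality score and that the system has a fixed total replica budget. The function name `allocate_replicas`, the `min_replicas` parameter, and the proportional (largest-remainder) allocation rule are all hypothetical choices made for this example.

```python
from fractions import Fraction


def allocate_replicas(criticality, budget, min_replicas=1):
    """Split a bounded replica budget across agents in proportion to criticality.

    criticality: dict mapping agent name -> non-negative criticality score
                 (hypothetical input; the abstract does not define the scale).
    budget: total number of replicas the system can afford.
    min_replicas: number of replicas every agent receives regardless of score.
    """
    if budget < min_replicas * len(criticality):
        raise ValueError("budget too small to give every agent min_replicas")

    # Every agent first receives the guaranteed minimum.
    allocation = {agent: min_replicas for agent in criticality}
    remaining = budget - min_replicas * len(criticality)

    # Exact rational arithmetic avoids floating-point rounding surprises.
    weights = {agent: Fraction(score) for agent, score in criticality.items()}
    total = sum(weights.values())
    if total == 0 or remaining == 0:
        return allocation

    # Largest-remainder rule: hand out the integer part of each proportional
    # share, then give the leftover replicas to the largest fractional parts.
    shares = {agent: remaining * w / total for agent, w in weights.items()}
    for agent, share in shares.items():
        allocation[agent] += int(share)
    leftover = remaining - sum(int(share) for share in shares.values())
    by_fraction = sorted(
        shares, key=lambda agent: shares[agent] - int(shares[agent]), reverse=True
    )
    for agent in by_fraction[:leftover]:
        allocation[agent] += 1
    return allocation


# Usage: rerun whenever criticality estimates change during the computation.
print(allocate_replicas({"broker": 3.0, "worker": 1.0}, budget=6))
# → {'broker': 4, 'worker': 2}
```

Calling the function again after each criticality update gives a simple form of dynamic adaptation. A real system would also have to account for the cost of creating and removing replicas, which this sketch ignores.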