Recognizing internal states of other agents to anticipate and coordinate interactions

Authors:
Filipo Studzinski Perotto
Affiliations:
Constructivist Artificial Intelligence Research Group, Toulouse, France
Venue:
EUMAS'11 Proceedings of the 9th European conference on Multi-Agent Systems
Year:
2011

Citing 24
Cited 0

A model for reasoning about persistence and causation

Computational Intelligence
Made-up minds: a constructivist approach to artificial intelligence

Made-up minds: a constructivist approach to artificial intelligence
Technical Note: \cal Q-Learning

Machine Learning
Acting optimally in partially observable stochastic domains

AAAI'94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 2)
Selection of relevant features and examples in machine learning

Artificial Intelligence - Special issue on relevance
Causality: models, reasoning, and inference

Causality: models, reasoning, and inference
Stochastic dynamic programming with factored representations

Artificial Intelligence
Markov Decision Processes: Discrete Stochastic Dynamic Programming

Markov Decision Processes: Discrete Stochastic Dynamic Programming
Introduction to Reinforcement Learning

Introduction to Reinforcement Learning
A causal approach to hierarchical decomposition of factored MDPs

ICML '05 Proceedings of the 22nd international conference on Machine learning
Learning the structure of Factored Markov Decision Processes in reinforcement learning problems

ICML '06 Proceedings of the 23rd international conference on Machine learning
Looping suffix tree-based inference of partially observable hidden state

ICML '06 Proceedings of the 23rd international conference on Machine learning
Bayesian Networks and Decision Graphs

Bayesian Networks and Decision Graphs
Efficient structure learning in factored-state MDPs

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Symbolic heuristic search value iteration for factored POMDPs

AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Value-function approximations for partially observable Markov decision processes

Journal of Artificial Intelligence Research
Efficient solution algorithms for factored MDPs

Journal of Artificial Intelligence Research
Planning and acting in partially observable stochastic domains

Artificial Intelligence
A dynamical systems perspective on agent-environment interaction

Artificial Intelligence
Computing optimal policies for partially observable decision processes using compact representations

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 2
Reinforcement learning with perceptual aliasing: the perceptual distinctions approach

AAAI'92 Proceedings of the tenth national conference on Artificial intelligence
SPUDD: stochastic planning using decision diagrams

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Solving POMDPs by searching the space of finite policies

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Model-based online learning of POMDPs

ECML'05 Proceedings of the 16th European conference on Machine Learning

Quantified Score

Hi-index	0.00

Visualization

Abstract

In multi-agent systems, anticipating the behavior of other agents constitutes a difficult problem. In this paper we present the case where a cognitive agent is inserted into an unknown environment composed of different kinds of other objects and agents; our cognitive agent needs to incrementally learn a model of the environment dynamics, doing it only from its interaction experience; the learned model can then be used to define a policy of actions. It is relatively easy to do so when the agent interacts with static objects, with simple mobile objects, or with trivial reactive agents; however, when the agent deals with other complex agents that may change their behaviors according to some non-directly observable internal properties (like emotional or intentional states), the construction of a model becomes significantly harder. The complete system can be described as a Factored and Partially Observable Markov Decision Process (FPOMDP); our agent implements the Constructivist Anticipatory Learning Mechanism (CALM) algorithm, and the experiment (called mept) shows that the induction of non-observable variables enable the agent to learn a deterministic model of most of the system events (if it represents a well-structured universe), allowing it to anticipate other agents actions and to adapt to them, even if some interactions appear as non-deterministic in a first sight.