This paper describes a method for constructing and evaluating teleo-reactive policies for one or more agents, based on discounted-reward evaluation of policy-restricted subgraphs of complete situation-graphs. The combinatorial burden that could arise from state-perception associations can be mitigated by suitable use of abstractions, and empirical simulation results indicate that the method scales well and offers good predictive power. The paper formally analyses the predictive quality of two abstractions: one for applications involving several agents, and one for applications with large numbers of perceptions. Sufficient conditions for reasonable predictive quality are given.
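The discounted-reward evaluation underlying the method can be illustrated with a minimal sketch. Fixing a policy restricts the situation-graph to one outgoing transition distribution per state, and the policy's value is then the usual discounted sum of rewards along that restricted graph. The function and data-structure names below are illustrative assumptions, not taken from the paper:

```python
def evaluate_policy(transitions, rewards, gamma=0.9, tol=1e-8):
    """Discounted-reward evaluation of a policy-restricted subgraph.

    transitions: {state: [(next_state, prob), ...]} -- the transitions
        remaining after the policy fixes one action per state.
    rewards: {state: immediate reward received in that state}.
    Returns {state: discounted value} by iterating the Bellman
    evaluation equation V(s) = r(s) + gamma * sum_t P(t|s) V(t).
    """
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, successors in transitions.items():
            v = rewards[s] + gamma * sum(p * values[t] for t, p in successors)
            delta = max(delta, abs(v - values[s]))
            values[s] = v  # in-place (Gauss-Seidel) update
        if delta < tol:
            return values

# Tiny two-state restricted graph: "a" and "b" alternate deterministically.
vals = evaluate_policy({"a": [("b", 1.0)], "b": [("a", 1.0)]},
                       {"a": 1.0, "b": 0.0}, gamma=0.5)
# Closed form: V(a) = 1 + 0.5 V(b), V(b) = 0.5 V(a), so V(a) = 4/3, V(b) = 2/3.
```

This evaluates a single fixed policy on an explicit graph; the paper's contribution concerns making such evaluation tractable when abstractions collapse many state-perception combinations.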