Advice generation from observed execution: abstract Markov decision process learning

Authors:
Patrick Riley;Manuela Veloso
Affiliations:
Computer Science Department, Carnegie Mellon University, Pittsburgh, PA;Computer Science Department, Carnegie Mellon University, Pittsburgh, PA
Venue:
AAAI'04 Proceedings of the 19th National Conference on Artificial Intelligence
Year:
2004

Abstract

An advising agent, or coach, provides advice to other agents about how to act. In this paper, we contribute an advice generation method that uses observations of agents acting in an environment. Given an abstract state definition and partially specified abstract actions, the algorithm extracts a Markov chain, infers a Markov decision process (MDP), and then solves the MDP (given an arbitrary reward signal) to generate advice. We evaluate our work in a simulated robot soccer environment. Experimental results show improved agent performance when using the advice generated from the MDP, for both a sub-task and the full soccer game.
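
The three steps named in the abstract (extract a Markov chain, infer an MDP, solve it for advice) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes, purely for illustration, that each abstract action is specified as the set of next states it may lead to from a given state, and that the MDP transition for a state-action pair is the chain's distribution restricted to that set and renormalized. The state names, action names, rewards, and the helpers `estimate_chain`, `infer_mdp`, and `solve` are all hypothetical.

```python
from collections import Counter, defaultdict

def estimate_chain(traces):
    """Estimate Markov chain transition probabilities from abstract-state traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        for s, s_next in zip(trace, trace[1:]):
            counts[s][s_next] += 1
    return {s: {t: c / sum(nxt.values()) for t, c in nxt.items()}
            for s, nxt in counts.items()}

def infer_mdp(chain, action_targets):
    """Build MDP transitions from the chain.

    action_targets maps action -> state -> set of next states the action may
    produce (an assumed, simplified form of a partial action specification).
    """
    mdp = defaultdict(dict)
    for action, per_state in action_targets.items():
        for s, targets in per_state.items():
            probs = {t: p for t, p in chain.get(s, {}).items() if t in targets}
            total = sum(probs.values())
            if total > 0:  # keep the action only if it was ever observed from s
                mdp[s][action] = {t: p / total for t, p in probs.items()}
    return dict(mdp)

def solve(mdp, reward, gamma=0.9, sweeps=200):
    """Value iteration; advice is the greedy action in each non-terminal state."""
    states = set(mdp) | {t for acts in mdp.values() for d in acts.values() for t in d}
    value = {s: 0.0 for s in states}

    def q(s, a):
        return sum(p * (reward.get(t, 0.0) + gamma * value[t])
                   for t, p in mdp[s][a].items())

    for _ in range(sweeps):
        for s in mdp:
            value[s] = max(q(s, a) for a in mdp[s])
    advice = {s: max(mdp[s], key=lambda a: q(s, a)) for s in mdp}
    return value, advice

# Hypothetical observed executions over abstract states.
traces = [
    ["midfield", "attack", "goal"],
    ["midfield", "attack", "lost"],
    ["midfield", "lost"],
    ["midfield", "attack", "goal"],
]
# Hypothetical partial action specifications.
action_targets = {
    "pass_forward": {"midfield": {"attack", "lost"}},
    "clear": {"midfield": {"lost"}},
    "shoot": {"attack": {"goal", "lost"}},
}
reward = {"goal": 1.0, "lost": -1.0}  # arbitrary reward signal on entered states

chain = estimate_chain(traces)
mdp = infer_mdp(chain, action_targets)
value, advice = solve(mdp, reward)
print(advice)
```

Because the reward enters only through `solve`, the same learned transition model can be reused with a different reward signal to produce different advice, which matches the abstract's "given an arbitrary reward signal".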