Active imitation learning

  • Authors:
  • Aaron P. Shon; Deepak Verma; Rajesh P. N. Rao

  • Affiliations:
  • Department of Computer Science and Engineering, University of Washington, Seattle, WA (all authors)

  • Venue:
  • AAAI'07: Proceedings of the 22nd National Conference on Artificial Intelligence - Volume 1
  • Year:
  • 2007

Abstract

Imitation learning, also called learning by watching or programming by demonstration, has emerged as a means of accelerating many reinforcement learning tasks. Previous work has shown the value of imitation in domains where a single mentor demonstrates execution of a known optimal policy for the benefit of a learning agent. We consider the more general scenario of learning from mentors who are themselves agents seeking to maximize their own rewards. We propose a new algorithm based on the concept of transferable utility for ensuring that an observer agent can learn efficiently in the context of a selfish, not necessarily helpful, mentor. We also address the questions of when an imitative agent should request help from a mentor, and when the mentor can be expected to acknowledge a request for help. In analogy with other types of active learning, we call the proposed approach active imitation learning.
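The interaction the abstract describes, an observer deciding when a demonstration is worth paying for, and a selfish mentor deciding whether the offered payment covers its own forgone reward, can be sketched in a toy form. This is a minimal illustration of the transferable-utility idea, not the authors' algorithm; the decision rules, thresholds, and values below are all assumptions for the sake of the example.

```python
def observer_should_ask(q_estimates, payment, uncertainty_threshold=0.5):
    """Decide whether the observer requests a demonstration.

    Uses the spread of the observer's action-value estimates as a crude
    stand-in for its uncertainty about the optimal action: ask only when
    that uncertainty outweighs the utility it must transfer to the mentor.
    (Hypothetical rule for illustration.)
    """
    spread = max(q_estimates) - min(q_estimates)
    return spread - payment > uncertainty_threshold


def mentor_should_accept(payment, opportunity_cost):
    """A selfish mentor acknowledges the request only if the transferred
    utility covers the reward it forgoes while demonstrating."""
    return payment >= opportunity_cost


# Toy interaction: the observer is unsure which of three actions is best.
q_estimates = [0.1, 1.4, 0.3]   # observer's current action-value estimates
payment = 0.5                   # utility offered to the mentor
opportunity_cost = 0.4          # mentor's cost of pausing its own task

if observer_should_ask(q_estimates, payment) and mentor_should_accept(payment, opportunity_cost):
    print("demonstration requested and granted")
```

In this sketch, lowering the payment below the mentor's opportunity cost makes the request fail on the mentor's side, while shrinking the spread of the observer's estimates makes the observer stop asking at all, mirroring the two-sided question the abstract raises.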