As computational learning agents move into domains that incur real costs (e.g., autonomous driving or financial investment), it will be necessary to learn good policies without numerous high-cost learning trials. One promising approach to reducing sample complexity of learning a task is knowledge transfer from humans to agents. Ideally, methods of transfer should be accessible to anyone with task knowledge, regardless of that person's expertise in programming and AI. This paper focuses on allowing a human trainer to interactively shape an agent's policy via reinforcement signals. Specifically, the paper introduces "Training an Agent Manually via Evaluative Reinforcement," or TAMER, a framework that enables such shaping. Differing from previous approaches to interactive shaping, a TAMER agent models the human's reinforcement and exploits its model by choosing actions expected to be most highly reinforced. Results from two domains demonstrate that lay users can train TAMER agents without defining an environmental reward function (as in an MDP) and indicate that human training within the TAMER framework can reduce sample complexity over autonomous learning algorithms.
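The core loop described above — learn a model of the human's reinforcement signal and act greedily with respect to it — can be sketched as follows. This is a minimal illustrative version with a tabular model and a simple incremental update; the class name, step size, and state/action encoding are assumptions, and the paper's actual agents use function approximation and credit assignment over delayed feedback, which are omitted here.

```python
import random
from collections import defaultdict

class TamerAgent:
    """Minimal sketch of a TAMER-style agent (illustrative, not the
    paper's implementation): fit a model H_hat(s, a) of the human
    trainer's reinforcement, then choose the action expected to be
    most highly reinforced."""

    def __init__(self, actions, step_size=0.1):
        self.actions = actions
        self.step_size = step_size
        # Tabular stand-in for the learned human-reinforcement model H_hat.
        self.h_hat = defaultdict(float)

    def choose_action(self, state):
        # Greedy selection: the agent exploits its model of human
        # reinforcement rather than an environmental reward function.
        best = max(self.h_hat[(state, a)] for a in self.actions)
        ties = [a for a in self.actions if self.h_hat[(state, a)] == best]
        return random.choice(ties)

    def update(self, state, action, human_reward):
        # Supervised update: move H_hat(s, a) toward the observed feedback.
        key = (state, action)
        self.h_hat[key] += self.step_size * (human_reward - self.h_hat[key])
```

For example, a trainer who repeatedly rewards one action and punishes another in some state will quickly bias `choose_action` toward the rewarded action, with no environmental reward ever defined.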