In this paper, we address a relatively unexplored aspect of designing agents that learn from human training: how the agent's non-task behavior can elicit human feedback of higher quality and quantity. As the foundation for our investigation, we use the TAMER framework, which facilitates training agents with human-generated reward signals, i.e., judgments of the quality of the agent's actions. We then propose two new training interfaces designed to increase the trainer's active involvement in the training process and thereby improve the agent's task performance: one shares information about the agent's uncertainty, the other about its performance. Results from a 51-subject user study show that these interfaces can induce trainers to train longer and give more feedback. The agent's performance, however, improves only in response to the addition of performance-oriented information, not to the sharing of uncertainty levels. Subsequent analysis of our results suggests that the organizational maxim about human behavior, "you get what you measure" (i.e., sharing metrics with people causes them to focus on maximizing or minimizing those metrics while de-emphasizing other objectives), also applies to the training of agents, providing a powerful guiding principle for human-agent interface design in general.