Reinforcement learning with human teachers: evidence of feedback and guidance with implications for learning performance

Authors:
Andrea L. Thomaz; Cynthia Breazeal
Affiliations:
MIT Media Lab, Cambridge, MA; MIT Media Lab, Cambridge, MA
Venue:
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Year:
2006

Abstract

As robots become a mass consumer product, they will need to learn new skills by interacting with typical human users. Past approaches have adapted reinforcement learning (RL) to accept a human reward signal; however, we question the implicit assumption that people will only want to give the learner feedback on its past actions. We present findings from a human user study showing that people use the reward signal not only to provide feedback about past actions, but also to provide future-directed rewards to guide subsequent actions. Given this, we made specific modifications to the simulated RL robot to incorporate guidance. We then analyzed and evaluated its learning performance in a second user study, and we report significant improvements on several measures. This work demonstrates the importance of understanding the human-teacher/robot-learner system as a whole in order to design algorithms that support how people want to teach while simultaneously improving the robot's learning performance.