As computational learning agents move into domains that incur real costs (e.g., autonomous driving or financial investment), it will be necessary to learn good policies without numerous high-cost learning trials. One promising approach to reducing sample complexity of learning a task is knowledge transfer from humans to agents. Ideally, methods of transfer should be accessible to anyone with task knowledge, regardless of that person's expertise in programming and AI. This paper focuses on allowing a human trainer to interactively shape an agent's policy via reinforcement signals. Specifically, the paper introduces "Training an Agent Manually via Evaluative Reinforcement," or TAMER, a framework that enables such shaping. Differing from previous approaches to interactive shaping, a TAMER agent models the human's reinforcement and exploits its model by choosing actions expected to be most highly reinforced. Results from two domains demonstrate that lay users can train TAMER agents without defining an environmental reward function (as in an MDP) and indicate that human training within the TAMER framework can reduce sample complexity over autonomous learning algorithms.
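The core loop described above — learn a model of the human's reinforcement signal and act greedily with respect to it — can be sketched as follows. This is a minimal illustrative version with a tabular model and a simple incremental update; the class name, step size, and state/action encoding are assumptions, and the paper's actual agents use function approximation and credit assignment over delayed feedback, which are omitted here.

```python
import random
from collections import defaultdict

class TamerAgent:
    """Minimal sketch of a TAMER-style agent (illustrative, not the
    paper's implementation): fit a model H_hat(s, a) of the human
    trainer's reinforcement, then choose the action expected to be
    most highly reinforced."""

    def __init__(self, actions, step_size=0.1):
        self.actions = actions
        self.step_size = step_size
        # Tabular stand-in for the learned human-reinforcement model H_hat.
        self.h_hat = defaultdict(float)

    def choose_action(self, state):
        # Greedy selection: the agent exploits its model of human
        # reinforcement rather than an environmental reward function.
        best = max(self.h_hat[(state, a)] for a in self.actions)
        ties = [a for a in self.actions if self.h_hat[(state, a)] == best]
        return random.choice(ties)

    def update(self, state, action, human_reward):
        # Supervised update: move H_hat(s, a) toward the observed feedback.
        key = (state, action)
        self.h_hat[key] += self.step_size * (human_reward - self.h_hat[key])
```

For example, a trainer who repeatedly rewards one action and punishes another in some state will quickly bias `choose_action` toward the rewarded action, with no environmental reward ever defined.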