This paper considers the problem of extending Training an Agent Manually via Evaluative Reinforcement (TAMER) to continuous state and action spaces. The TAMER framework enables a non-technical human to train an agent through a natural form of human feedback (positive or negative). The advantages of TAMER have been demonstrated on tasks in which agents are trained either by human feedback alone or by human feedback combined with environment rewards. However, these methods were originally designed for problems with discrete states and actions, or with continuous states and discrete actions. This paper proposes ACTAMER, an extension of TAMER that allows both continuous states and actions. The new framework can use any general function approximator to model the human trainer's feedback signal. Moreover, the combination of ACTAMER with reinforcement learning is also investigated and evaluated, in both sequential and simultaneous settings. Our experimental results demonstrate that the proposed method successfully allows a human to train an agent in two continuous state-action domains: Mountain Car and Cart-pole (balancing).
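To make the TAMER-style learning loop concrete, the sketch below shows one possible (hypothetical, greatly simplified) setup: a linear function approximator learns a model of the human's scalar feedback over continuous state-action pairs, and the agent acts greedily with respect to that model by maximizing over sampled candidate actions. The feature map, learning rate, and class names here are illustrative assumptions, not the paper's actual ACTAMER implementation, which permits any general function approximator.

```python
import numpy as np

class LinearHumanModel:
    """Hypothetical linear model H(s, a) ~ w . phi(s, a) of human feedback."""

    def __init__(self, n_features, lr=0.1):
        self.w = np.zeros(n_features)   # weight vector of the approximator
        self.lr = lr                    # learning-rate assumption

    def features(self, state, action):
        # Illustrative feature map: raw state/action values plus
        # state-action product terms (an assumption for this sketch).
        sa = np.concatenate([state, action])
        return np.concatenate([sa, np.outer(state, action).ravel()])

    def predict(self, state, action):
        # Predicted human feedback for a state-action pair.
        return self.w @ self.features(state, action)

    def update(self, state, action, human_feedback):
        # Gradient step toward the observed scalar feedback (e.g. +1 / -1).
        err = human_feedback - self.predict(state, action)
        self.w += self.lr * err * self.features(state, action)

def greedy_action(model, state, candidates):
    # With a continuous action space, exact maximization is impractical,
    # so this sketch maximizes over a finite sample of candidate actions.
    preds = [model.predict(state, a) for a in candidates]
    return candidates[int(np.argmax(preds))]
```

After a positive feedback signal on some action, the greedy policy shifts toward actions whose features resemble the rewarded one; in a sequential or simultaneous combination with reinforcement learning, this predicted human reward would be blended with the environment's reward signal.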