In complex real-world environments, traditional (tabular) Reinforcement Learning (RL) techniques do not scale. Function approximation is needed, but existing approaches generally have poor convergence and optimality guarantees. Additionally, in human environments it is valuable to be able to leverage human input. In this paper we introduce Expanding Value Function Approximation (EVFA), a function approximation algorithm that returns the optimal value function given sufficient rounds. To leverage human input, we introduce a new human-agent interaction scheme, training regimens, which allows humans to interact with and improve agent learning in the setting of a machine learning game. In experiments, we show that EVFA compares favorably to standard value approximation approaches. We also show that training regimens enable humans to further improve EVFA performance. In our user study, we find that non-experts are able to provide effective regimens and that they find the game fun.
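For context on the baseline the abstract refers to, the sketch below shows one standard value-approximation approach, linear semi-gradient TD(0). It is not the EVFA algorithm; the Gymnasium-style env interface and the featurize function are assumptions introduced purely for illustration.

    import numpy as np

    def semi_gradient_td0(env, featurize, num_features, episodes=500,
                          alpha=0.05, gamma=0.99):
        # Estimate V(s) ~= w . phi(s) under the behaviour policy.
        # `env` is assumed to follow the Gymnasium reset()/step() API and
        # `featurize(state)` to return a fixed-length feature vector;
        # both are placeholders, not part of the EVFA paper.
        w = np.zeros(num_features)
        for _ in range(episodes):
            state, _ = env.reset()
            done = False
            while not done:
                action = env.action_space.sample()  # stand-in behaviour policy
                next_state, reward, terminated, truncated, _ = env.step(action)
                done = terminated or truncated
                phi = featurize(state)
                v = w @ phi
                v_next = 0.0 if terminated else w @ featurize(next_state)
                # Semi-gradient TD(0) step toward the bootstrapped target.
                w += alpha * (reward + gamma * v_next - v) * phi
                state = next_state
        return w

Unlike EVFA, this baseline carries no optimality guarantee under function approximation, which is the gap the paper's algorithm is stated to address.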