Creating advice-taking reinforcement learners. Machine Learning, special issue on reinforcement learning.
Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artificial Intelligence.
Introduction to Reinforcement Learning.
The MAXQ Method for Hierarchical Reinforcement Learning. ICML '98: Proceedings of the Fifteenth International Conference on Machine Learning.
Advances in Neural Information Processing Systems 5 (NIPS Conference).
Least-squares policy iteration. The Journal of Machine Learning Research.
Behavior transfer for value-function-based reinforcement learning. Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems.
Using homomorphisms to transfer options across continuous reinforcement learning domains. AAAI'06: Proceedings of the 21st National Conference on Artificial Intelligence, Volume 1.
Using advice to transfer knowledge acquired in one reinforcement learning task to another. ECML'05: Proceedings of the 16th European Conference on Machine Learning.
Traditionally, research in the reinforcement learning (RL) community has focused on developing domain-independent algorithms such as SARSA [13], Q-learning [16], prioritized sweeping [8], and LSPI [6], which are designed to work with any given state space and action space. In practice, however, a human expert re-codes each learning environment, defining the actions and state features and specifying the algorithm to be used. Typically, each new RL experiment is run by explicitly invoking a new program, even when learning can be biased by previous learning experiences, as in transfer learning [10, 15, 14]. Thus, while standards have emerged for describing and testing individual RL algorithms (e.g., RL-Glue [17]), no comparable standards exist for describing complete tasks to a preexisting agent.
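To make the notion of domain independence concrete, the following is a minimal sketch of tabular SARSA [13] written against abstract state and action sets: nothing in it refers to a particular environment, so the same agent can be reused on any task that supplies states, actions, and rewards. The class name and hyperparameter defaults are illustrative, not taken from any specific library.

    # Minimal tabular SARSA sketch; names and defaults are illustrative.
    import random
    from collections import defaultdict

    class SarsaAgent:
        def __init__(self, actions, alpha=0.1, gamma=0.99, epsilon=0.1):
            self.actions = list(actions)   # action space, supplied by the task
            self.alpha = alpha             # learning rate
            self.gamma = gamma             # discount factor
            self.epsilon = epsilon         # exploration rate
            self.q = defaultdict(float)    # Q(s, a), keyed by (state, action)

        def act(self, state):
            # Epsilon-greedy selection over whatever actions the task defines.
            if random.random() < self.epsilon:
                return random.choice(self.actions)
            return max(self.actions, key=lambda a: self.q[(state, a)])

        def update(self, s, a, r, s_next, a_next):
            # SARSA update: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a)).
            target = r + self.gamma * self.q[(s_next, a_next)]
            self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])

Note that everything task-specific (the state representation, the action set, the reward signal) arrives from outside the agent; this is exactly the part that, absent a standard task-description format, a human expert must re-code for each new experiment.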