Adaptive choice of grid and time in reinforcement learning
NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning
Artificial Intelligence
Using Options for Knowledge Transfer in Reinforcement Learning
Hierarchical control and learning for Markov decision processes
Hierarchical reinforcement learning with the MAXQ value function decomposition
Journal of Artificial Intelligence Research
Recursive Adaptation of Stepsize Parameter for Non-stationary Environments
PRIMA '09 Proceedings of the 12th International Conference on Principles of Practice in Multi-Agent Systems
Learning to control at multiple time scales
ICANN/ICONIP'03 Proceedings of the 2003 joint international conference on Artificial neural networks and neural information processing
Recursive Adaptation of Stepsize Parameter for Non-stationary Environments
ALA'09 Proceedings of the Second international conference on Adaptive and Learning Agents
Wireless Personal Communications: An International Journal
In recent years, hierarchical concepts of temporal abstraction have been integrated into the reinforcement learning framework to improve scalability. However, existing approaches are limited to domains where a decomposition into subtasks is known a priori. In this paper we propose the concept of explicitly selecting time-scale-related actions when no subgoal-related abstract actions are available. This is realised with multi-step actions on different time scales that are combined in a single action set. The special structure of this action set is exploited in the MSAQ-learning algorithm. By learning simultaneously on different, explicitly specified time scales, a considerable improvement in learning speed can be achieved. This is demonstrated on two benchmark problems.