Critical issues in modular or hierarchical reinforcement learning (RL) are (i) how to decompose a task into sub-tasks, (ii) how to learn the sub-tasks independently, and (iii) how to assure optimality of the composite policy for the entire task. Requirements (ii) and (iii) often trade off against each other. We propose a method for propagating the reward for achievement of the entire task between modules. This is done in the form of a 'modular reward', which is calculated from the temporal difference of the module gating signal and the value of the succeeding module. We implement the modular reward in a multiple model-based reinforcement learning (MMRL) architecture and show its effectiveness in simulations of a pursuit task with hidden states and a continuous-time non-linear control task.
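As a minimal sketch of the idea described above, the snippet below computes a modular reward as the gating-weighted share of the global reward plus the temporal difference of the gating signal weighted by the value of the succeeding module. The function name, signature, and exact weighting are illustrative assumptions, not the paper's implementation.

```python
def modular_reward(global_reward, gate_prev, gate_curr, value_next):
    """Hypothetical sketch of a modular reward signal.

    gate_prev, gate_curr : module gating (responsibility) signal at the
        previous and current time steps, in [0, 1]
    value_next : estimated value of the succeeding module's state

    The first term gives the module its gated share of the global reward;
    the second term propagates credit between modules via the temporal
    difference of the gating signal.
    """
    return gate_curr * global_reward + (gate_curr - gate_prev) * value_next


# Example: a module whose gating rises from 0.2 to 0.8 while handing
# over to a successor module valued at 0.5 receives extra credit.
r_mod = modular_reward(global_reward=1.0, gate_prev=0.2,
                       gate_curr=0.8, value_next=0.5)
```

In an MMRL-style architecture, each module's critic would then be trained on its own `r_mod` rather than on the raw global reward, which is what lets the sub-task learners stay independent while the gating dynamics carry credit for the composite policy.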