We propose a modular reinforcement learning architecture for nonlinear, nonstationary control tasks, which we call multiple model-based reinforcement learning (MMRL). The basic idea is to decompose a complex task into multiple domains in space and time based on the predictability of the environmental dynamics. The system is composed of multiple modules, each of which consists of a state prediction model and a reinforcement learning controller. The "responsibility signal," which is given by the softmax function of the prediction errors, is used to weight the outputs of the multiple modules, as well as to gate the learning of the prediction models and the reinforcement learning controllers. We formulate MMRL for both the discrete-time, finite-state case and the continuous-time, continuous-state case. The performance of MMRL was demonstrated in the discrete case on a nonstationary hunting task in a grid world, and in the continuous case on a nonlinear, nonstationary control task of swinging up a pendulum with variable physical parameters.
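The responsibility-weighting mechanism described above can be sketched in a few lines. The sketch below assumes Gaussian-likelihood responsibilities (softmax of negative squared prediction errors with a scale parameter sigma); the function names and the scalar-action setting are illustrative, not part of the original formulation.

```python
import numpy as np

def responsibility(pred_errors, sigma=1.0):
    """Softmax of negative squared prediction errors.

    Modules whose prediction model fits the current dynamics
    (small error) receive responsibility close to 1.
    """
    logits = -np.asarray(pred_errors, dtype=float) ** 2 / (2.0 * sigma ** 2)
    logits -= logits.max()  # subtract max for numerical stability
    w = np.exp(logits)
    return w / w.sum()

def mmrl_step(pred_errors, module_actions, sigma=1.0):
    """Blend the controllers' outputs by responsibility.

    The same weights would also gate each module's learning rate,
    so only the currently responsible module adapts.
    """
    lam = responsibility(pred_errors, sigma)
    u = float(np.dot(lam, np.asarray(module_actions, dtype=float)))
    return lam, u
```

For example, with two modules whose prediction errors are 0.1 and 2.0, the first module dominates the blended action, so the composite controller smoothly switches as the environment's dynamics change.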