Q-error as a selection mechanism in modular reinforcement-learning systems
IJCAI'11 Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence - Volume Two
Reinforcement learning is a technique for learning action policies that maximize utility, guided by reinforcement signals: reward or punishment. Q-learning, a widely used reinforcement learning method, has been analyzed in much research on autonomous agents. However, as the size of the problem space increases, agents need more computational resources and more time to learn appropriate policies. Whitehead proposed an architecture called modular Q-learning, which decomposes the whole problem space into smaller subproblem spaces and distributes them among multiple modules, so that each module takes charge of part of the whole problem. In modular Q-learning, however, human designers must decompose the problem space and create a suitable set of modules manually; agents with such a fixed module architecture cannot adapt themselves to dynamic environments. Here, we propose a new reinforcement-learning architecture called AMQL (Automatic Modular Q-Learning), which enables agents to obtain a suitable set of modules by themselves using a selection method. Through experiments, we show that agents can automatically obtain suitable modules for gaining reward, and that they can adapt efficiently to dynamic environments by reconstructing their modules.
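The modular Q-learning scheme the abstract describes can be sketched in a few lines: each module runs tabular Q-learning over a projection of the full state, and a combined action is chosen by summing the modules' Q-values (a greatest-mass style combination). This is a minimal illustrative sketch, not the paper's AMQL implementation; the `project` functions, the summation rule, and all parameter values are assumptions for illustration.

```python
import random
from collections import defaultdict

class QModule:
    """One module: tabular Q-learning over a sub-projection of the state."""
    def __init__(self, project, actions, alpha=0.1, gamma=0.9):
        self.project = project          # maps the full state to this module's substate
        self.actions = actions
        self.alpha, self.gamma = alpha, gamma
        self.q = defaultdict(float)     # (substate, action) -> Q-value

    def values(self, state):
        s = self.project(state)
        return {a: self.q[(s, a)] for a in self.actions}

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update, applied to this module's substate only.
        s, s2 = self.project(state), self.project(next_state)
        best_next = max(self.q[(s2, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.q[(s, action)]
        self.q[(s, action)] += self.alpha * td_error

def select_action(modules, state, epsilon=0.1):
    """Epsilon-greedy over the sum of all modules' Q-values (greatest mass)."""
    if random.random() < epsilon:
        return random.choice(modules[0].actions)
    totals = defaultdict(float)
    for m in modules:
        for a, v in m.values(state).items():
            totals[a] += v
    return max(totals, key=totals.get)
```

With a two-component state `(x, y)`, one module could project onto `x` and another onto `y`; each then learns over a much smaller table than a single monolithic learner would need, which is the computational saving the abstract refers to.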