Exploitation-oriented Learning (XoL) is a novel approach to goal-directed learning from interaction. Whereas reinforcement learning focuses much more on learning and can guarantee optimality in Markov Decision Process (MDP) environments, XoL aims to learn a rational policy, one whose expected reward per action is larger than zero, very quickly. PS-r* is one known XoL method. It can learn a useful rational policy that is not inferior to a random walk in Partially Observable Markov Decision Process (POMDP) environments where there is only one type of reward. However, PS-r* requires O(MN^2) memory, where N and M are the numbers of types of sensory inputs and actions, respectively. In this paper, we propose PS-r#, which can learn a useful rational policy in POMDP environments with O(MN) memory. We confirm the effectiveness of PS-r# in numerical examples.
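To give a rough sense of the gap between the two memory bounds, the following sketch counts table entries for a few problem sizes. It assumes a constant factor of one in both bounds, since the abstract states only the asymptotic orders O(MN^2) and O(MN). The function names are hypothetical, and the code does not implement PS-r* or PS-r#.

```python
def entries_quadratic(n_inputs: int, n_actions: int) -> int:
    """Entry count on the order of M * N^2 (constant factor assumed to be 1)."""
    return n_actions * n_inputs ** 2


def entries_linear(n_inputs: int, n_actions: int) -> int:
    """Entry count on the order of M * N (constant factor assumed to be 1)."""
    return n_actions * n_inputs


if __name__ == "__main__":
    # Illustrative sizes only; they are not taken from the paper's experiments.
    for n_inputs, n_actions in [(10, 4), (100, 4), (1000, 4)]:
        quadratic = entries_quadratic(n_inputs, n_actions)
        linear = entries_linear(n_inputs, n_actions)
        print(f"N={n_inputs}, M={n_actions}: "
              f"M*N^2={quadratic}, M*N={linear}, ratio={quadratic // linear}")
```

Under these assumptions the ratio between the two counts equals N, so the saving grows with the number of distinct sensory inputs.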