In the reinforcement learning framework, an agent learns an optimal policy by maximizing return rather than by following choices instructed by a supervisor. The framework is generally formulated as an ergodic Markov decision process, and the parameters of the action-selection strategy are tuned so that the learning process eventually becomes almost stationary. In this paper, we examine a more general theoretical class of processes in which the agent can achieve return maximization, by considering the asymptotic equipartition property of such processes. As a result, we derive several necessary conditions that the agent and the environment must satisfy for return maximization to be possible.
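The asymptotic equipartition property mentioned above can be illustrated numerically in the simplest ergodic case. The following is a minimal sketch, not the construction used in the paper: it assumes a hypothetical two-state stationary Markov chain (the transition matrix `P` and all function names are chosen here for illustration) and checks that the per-step negative log-probability of a long sample path is close to the entropy rate of the chain.

```python
import math
import random

# Hypothetical two-state ergodic Markov chain; the values are illustrative only.
P = [[0.9, 0.1],
     [0.4, 0.6]]

def stationary(P):
    # Closed-form stationary distribution of a two-state chain.
    a, b = P[0][1], P[1][0]
    return [b / (a + b), a / (a + b)]

def entropy_rate(P):
    # Entropy rate in bits per step: H = -sum_i pi_i sum_j P_ij log2 P_ij.
    pi = stationary(P)
    return -sum(pi[i] * P[i][j] * math.log2(P[i][j])
                for i in range(2) for j in range(2) if P[i][j] > 0)

def sample_log_prob_rate(P, n, rng):
    # Sample a length-n path started from the stationary distribution and
    # return -(1/n) * log2 of the probability of that path.
    pi = stationary(P)
    x = 0 if rng.random() < pi[0] else 1
    log_p = math.log2(pi[x])
    for _ in range(n - 1):
        nxt = 0 if rng.random() < P[x][0] else 1
        log_p += math.log2(P[x][nxt])
        x = nxt
    return -log_p / n

rng = random.Random(0)  # fixed seed so the run is reproducible
H = entropy_rate(P)
rate = sample_log_prob_rate(P, 200_000, rng)
print(f"entropy rate: {H:.4f} bits/step, empirical rate: {rate:.4f} bits/step")
```

For long paths the two printed quantities should nearly coincide, which is the sense in which almost all sample paths of such a process are roughly equally probable. The paper's setting, where the agent's action-selection strategy changes during learning, is more general than this stationary example.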