Learning to Perceive and Act by Trial and Error
Machine Learning
Technical Note: Q-Learning
Machine Learning
Introduction to Reinforcement Learning
Temporal credit assignment in reinforcement learning
Reinforcement learning: a survey
Journal of Artificial Intelligence Research
Watkins' Q-learning is the most popular and an effective model-free reinforcement learning method. However, compared with model-based approaches, Q-learning with its various exploration strategies requires a large number of trial-and-error interactions to find an optimal policy. To overcome this drawback, we propose a new model-based learning method that extends Q-learning. The method maintains two separate functions, EI and ER, for learning an exploitation-based model and an exploration-based model, respectively. The EI function, based on statistics, indicates the best action. The ER function, based on the information of exploration, guides the learner toward poorly known regions of the global state space by performing a backup at each step. We also introduce a new criterion that quantifies this information of exploration. By combining the two functions, we can pursue the exploitation and exploration strategies effectively and select actions that take both strategies into account simultaneously.
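The dual-function idea in the abstract can be sketched in tabular form. This is a minimal illustration, not the authors' exact algorithm: the names `EI`, `ER`, and `BETA` are assumptions, and a count-based novelty bonus stands in for the paper's unspecified exploration criterion. Both tables are backed up with the same Q-learning-style rule, and action selection combines them so that exploitation and exploration are weighed simultaneously.

```python
from collections import defaultdict

# Hyperparameters (illustrative values): learning rate, discount,
# and the weight that trades exploration off against exploitation.
ALPHA, GAMMA, BETA = 0.5, 0.9, 1.0

EI = defaultdict(float)    # exploitation values, updated like standard Q-learning
ER = defaultdict(float)    # exploration values, backed up by the same rule
visits = defaultdict(int)  # visit counts used as the "information of exploration"

def select_action(state, actions):
    # Combine both functions so a single action reflects both strategies.
    return max(actions, key=lambda a: EI[(state, a)] + BETA * ER[(state, a)])

def update(state, action, reward, next_state, actions):
    visits[(state, action)] += 1
    # Exploitation backup: ordinary Q-learning target on the real reward.
    best_next = max(EI[(next_state, a)] for a in actions)
    EI[(state, action)] += ALPHA * (reward + GAMMA * best_next - EI[(state, action)])
    # Exploration backup: a count-based bonus (an assumption standing in for
    # the paper's criterion) is propagated step by step, so the ER values
    # point the learner toward poorly known regions of the state space.
    bonus = 1.0 / (1 + visits[(state, action)])
    best_next_er = max(ER[(next_state, a)] for a in actions)
    ER[(state, action)] += ALPHA * (bonus + GAMMA * best_next_er - ER[(state, action)])
```

Backing up ER at every step, rather than only adding a bonus at selection time, is what lets exploration information flow backward through the state space the same way value estimates do.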