Extending reinforcement learning (RL) to large state spaces inevitably runs into the curse of dimensionality, so improving the agent's learning efficiency is critical for practical applications of RL. Consider learning to solve Markov decision problems optimally in a particular domain: if the domain has characteristics attributable to individual states, the agent may be able to exploit these features to direct future learning. This paper first defines the local state feature and uses a state feature function to generate the local state features of a state. A weight function is then introduced to adjust the current policy toward actions worth exploring. On this basis, an improved SARSA algorithm, Feature-SARSA, is proposed. We validate the new algorithm experimentally on a complex domain, Sokoban. The results show that the new algorithm performs better.
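The idea behind the abstract can be sketched in code. The following is a minimal, illustrative SARSA loop on a 1-D corridor in which a state-feature function supplies local features (here simply wall adjacency, an assumed feature set, not the paper's actual definition) and a weight function biases exploratory action selection toward actions that avoid wall features and lead into less-visited states. All names and the specific weighting heuristic are assumptions for illustration only.

```python
import random
from collections import defaultdict

random.seed(0)

N = 8                 # corridor states 0..N-1; goal at N-1
ACTIONS = (-1, +1)    # move left / right

def step(s, a):
    """One environment transition: small step cost, +1 at the goal."""
    s2 = max(0, min(N - 1, s + a))
    return s2, (1.0 if s2 == N - 1 else -0.01), s2 == N - 1

def features(s):
    """Local state features (assumed): at-left-wall, at-right-wall."""
    return (s == 0, s == N - 1)

visits = defaultdict(int)

def weight(s, a):
    """Weight function (assumed heuristic): down-weight actions blocked
    by a wall feature, prefer actions leading to less-visited states."""
    at_left, at_right = features(s)
    if (a < 0 and at_left) or (a > 0 and at_right):
        return 0.1
    return 1.0 / (1 + visits[max(0, min(N - 1, s + a))])

Q = defaultdict(float)
alpha, gamma, eps = 0.5, 0.95, 0.2

def select(s):
    """Epsilon-greedy, with exploration reweighted by the weight function."""
    if random.random() < eps:
        return random.choices(ACTIONS, weights=[weight(s, a) for a in ACTIONS])[0]
    return max(ACTIONS, key=lambda a: Q[s, a])

for _ in range(200):              # standard on-policy SARSA episodes
    s, a = 0, select(0)
    for _ in range(50):
        visits[s] += 1
        s2, r, done = step(s, a)
        a2 = select(s2)
        target = r if done else r + gamma * Q[s2, a2]
        Q[s, a] += alpha * (target - Q[s, a])
        s, a = s2, a2
        if done:
            break

# After training, the greedy policy moves right along the whole corridor.
```

The only change from vanilla SARSA is inside `select`: exploratory draws are weighted rather than uniform, which is one simple way to read the abstract's "adjust current policy to the actions worth exploring".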