This paper presents a novel model of reinforcement learning agents. The key feature of our model is the integration of the analytic hierarchy process (AHP) into a standard reinforcement learning agent architecture, which consists of three modules: state recognition, learning, and action selection. The AHP module is designed with primary knowledge that a human would intrinsically apply to reach a goal state. This aims to increase the proportion of promising actions, especially in the early stages of learning, in place of the completely random actions used by standard reinforcement learning algorithms. We adopt profit sharing as the reinforcement learning algorithm and demonstrate the potential of our approach on two grid-world learning problems: a pursuit problem and a Sokoban problem with deadlocks. The results indicate that learning time can be decreased considerably for both problems and that our approach efficiently avoids deadlocks in the Sokoban problem. We also show that the adverse effect typically observed when a priori knowledge is introduced into the reinforcement learning process can be restrained by a method that decreases the rate at which the knowledge is used as learning progresses.
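The two mechanisms the abstract describes, profit-sharing credit assignment and knowledge-biased action selection with a decaying usage rate, can be sketched as follows. This is a minimal illustration under assumed interfaces, not the paper's implementation: the function names (`profit_sharing_update`, `select_action`, `decayed_rate`), the geometric credit schedule, and the exponential decay of the knowledge rate are all illustrative choices, and the AHP module is stood in for by a simple dictionary of heuristic action scores.

```python
import random


def profit_sharing_update(q, episode_trace, reward, credit_decay=0.5):
    """Profit sharing: after the goal is reached, distribute the reward
    back along the episode's (state, action) trace with geometrically
    decreasing credit (an assumed schedule, for illustration)."""
    credit = reward
    for state, action in reversed(episode_trace):
        q[(state, action)] = q.get((state, action), 0.0) + credit
        credit *= credit_decay


def select_action(actions, heuristic_scores, knowledge_rate, rng=random):
    """With probability `knowledge_rate`, follow the prior-knowledge
    (AHP-style) ranking; otherwise fall back to random exploration."""
    if rng.random() < knowledge_rate:
        # Exploit prior knowledge: pick the highest-scored action.
        return max(actions, key=lambda a: heuristic_scores[a])
    return rng.choice(actions)


def decayed_rate(initial_rate, episode, decay=0.99):
    """Decrease the rate of using knowledge as learning progresses,
    restraining the adverse effect of an imperfect prior."""
    return initial_rate * (decay ** episode)
```

Early in training (`knowledge_rate` near 1) the agent mostly follows the heuristic ranking, so promising actions dominate; as episodes accumulate, `decayed_rate` shrinks this bias and the learned profit-sharing values take over.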