One of the most influential factors in cooperative learning is the type of information exchanged: the richer the information shared among agents, the better the results of cooperation. To extract suitable knowledge from agents during the cooperation process, expertness measures are used to assign an expertness level to each of the other agents. In this paper, a new method named Multi-Criteria Expertness based cooperative Q-learning (MCE) is proposed that utilizes all of the expertness measures and thereby enriches the exchanged information. In MCE, all expertness measures are considered simultaneously, and the collective knowledge is the combination of the knowledge learned under each expertness measure. The experimental results confirm the strong performance of the proposed method on a sample maze world and a hunter-prey problem.
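The combination step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each agent's knowledge is a Q-table mapping (state, action) pairs to values, that each expertness measure yields one nonnegative score per agent used as a normalized weight, and that MCE's multi-criteria combination averages the tables produced under the individual measures. The function names and the averaging choice are hypothetical.

```python
def combine_q_tables(q_tables, expertness):
    # Weighted sum of agents' Q-tables (dicts mapping
    # (state, action) -> value), each table weighted by the
    # agent's normalized expertness score under one measure.
    # The particular measure (e.g. accumulated reward) is an
    # assumption of this sketch.
    total = float(sum(expertness))
    weights = [e / total for e in expertness]
    combined = {}
    for w, q in zip(weights, q_tables):
        for key, value in q.items():
            combined[key] = combined.get(key, 0.0) + w * value
    return combined

def multi_criteria_combine(q_tables, expertness_by_measure):
    # Hypothetical reading of MCE: combine the agents' tables
    # once per expertness measure, then average the resulting
    # tables so that every measure contributes to the
    # collective knowledge.
    per_measure = [combine_q_tables(q_tables, scores)
                   for scores in expertness_by_measure]
    n = len(per_measure)
    out = {}
    for q in per_measure:
        for key, value in q.items():
            out[key] = out.get(key, 0.0) + value / n
    return out
```

For example, with two agents whose Q-values for one state-action pair are 1.0 and 0.0, expertness scores of [3, 1] under one measure give a combined value of 0.75; averaging that with an equal-weight measure [1, 1] gives 0.625.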