Adaptive learning algorithm of self-organizing teams
Expert Systems with Applications: An International Journal
This paper studies a multi-goal Q-learning algorithm for cooperative teams. Each member of a cooperative team is simulated by an agent. In the virtual cooperative team, each agent adapts its knowledge according to cooperative principles. The multi-goal Q-learning algorithm is applied to multiple learning goals. In the virtual team, agents learn what knowledge to adopt and how much to learn (by choosing a learning radius). The learning radius is explained in Section 3.1. Five basic experiments are conducted to validate the multi-goal Q-learning algorithm. The results show that the learning algorithm causes agents to converge to optimal actions, based on the agents' continually updated cognitive maps of how actions influence the learning goals. The results also show that the learning algorithm benefits the multiple goals. Furthermore, the paper analyzes how sensitive the learning performance is to the parameter values of the learning algorithm.
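
To make the idea of learning toward several goals at once concrete, the following is a minimal sketch of one common way to set up multi-goal Q-learning: one Q-table per goal, with actions chosen greedily on a weighted sum of the per-goal values. It is not taken from the paper. The weighted-sum combination, the function names (`multi_goal_q_learning`, `toy_step`), the parameter values, and the one-state toy environment are all assumptions made for illustration, and the learning radius is not modeled.

```python
import random


def multi_goal_q_learning(n_states, n_actions, step, weights,
                          episodes=500, horizon=20,
                          alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with one Q-table per goal (illustrative sketch).

    step(state, action) must return (next_state, rewards), where rewards
    holds one reward per goal. The weighted sum used to combine goals is an
    assumption of this sketch, not the paper's rule.
    """
    rng = random.Random(seed)
    n_goals = len(weights)
    # q[g][s][a]: estimated return for goal g when taking action a in state s.
    q = [[[0.0] * n_actions for _ in range(n_states)] for _ in range(n_goals)]

    def combined(s, a):
        # Scalarize the per-goal estimates with fixed weights.
        return sum(w * q[g][s][a] for g, w in enumerate(weights))

    def greedy(s):
        return max(range(n_actions), key=lambda a: combined(s, a))

    for _ in range(episodes):
        s = rng.randrange(n_states)
        for _ in range(horizon):
            # Epsilon-greedy exploration on the combined value.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = greedy(s)
            s_next, rewards = step(s, a)
            a_next = greedy(s_next)
            # Update every goal's table toward its own bootstrapped target.
            for g in range(n_goals):
                target = rewards[g] + gamma * q[g][s_next][a_next]
                q[g][s][a] += alpha * (target - q[g][s][a])
            s = s_next
    return q, greedy


def toy_step(state, action):
    # Hypothetical one-state task with two goals: action 0 serves only
    # goal 0, action 1 serves only goal 1, action 2 serves both partially.
    rewards = [(1.0, 0.0), (0.0, 1.0), (0.8, 0.8)][action]
    return 0, rewards


q_tables, policy = multi_goal_q_learning(
    n_states=1, n_actions=3, step=toy_step, weights=(0.5, 0.5))
best_action = policy(0)
```

With equal weights, the compromise action 2 has the highest combined reward (0.8 versus 0.5), so the learned greedy policy should settle on it. Changing `weights`, `alpha`, or `epsilon` gives a simple way to probe how sensitive the learned behavior is to parameter values.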