Learning Automata (LA) have recently been shown to be valuable tools for designing Multi-Agent Reinforcement Learning algorithms. One of the principal contributions of LA theory is that a set of decentralized, independent learning automata can control a finite Markov chain with unknown transition probabilities and rewards. This result was recently extended to Markov games and analyzed using limiting games. In this paper we continue this analysis, but we now assume that the agents are fully ignorant of the other agents in the environment: each agent can only observe itself, and knows neither how many other agents are present, which actions they took, which rewards they received, nor which locations they occupy in the state space. We prove that in Markov games where agents have this limited type of observability, a network of independent LA is still able to converge to an equilibrium point of the underlying limiting game, provided a common ergodicity assumption holds and the agents do not interfere with each other's transition probabilities.
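To make the setting concrete, the following is a minimal sketch of the standard linear reward-inaction (L_R-I) update scheme on which such LA networks are typically built. The class name, learning rate, and the 2x2 common-interest game used below are illustrative assumptions, not taken from the paper; the key property shown is that two fully independent automata, each observing only its own action and reward, can still coordinate on an equilibrium.

```python
import random

class LearningAutomaton:
    """Linear reward-inaction (L_R-I) automaton (illustrative sketch).

    The automaton keeps a probability vector over its actions and updates
    it only when a reward is received (inaction on failure) -- it never
    observes other agents, their actions, or their rewards.
    """

    def __init__(self, n_actions, lr=0.05):
        self.p = [1.0 / n_actions] * n_actions  # uniform initial policy
        self.lr = lr

    def choose(self, rng):
        # Sample an action according to the current probability vector.
        return rng.choices(range(len(self.p)), weights=self.p)[0]

    def update(self, action, reward):
        # L_R-I rule: move probability mass toward the chosen action,
        # scaled by the reward in [0, 1]; reward 0 leaves p unchanged.
        for j in range(len(self.p)):
            if j == action:
                self.p[j] += self.lr * reward * (1.0 - self.p[j])
            else:
                self.p[j] -= self.lr * reward * self.p[j]

# Two independent automata in a hypothetical 2x2 common-interest game:
# both receive reward 1 only when they jointly play action 0.
rng = random.Random(0)
a, b = LearningAutomaton(2), LearningAutomaton(2)
for _ in range(2000):
    i, j = a.choose(rng), b.choose(rng)
    r = 1.0 if (i == 0 and j == 0) else 0.0
    a.update(i, r)  # each automaton sees only its own action and reward
    b.update(j, r)
```

After the loop, both probability vectors concentrate on the coordinated joint action (0, 0), an equilibrium of this game, even though neither automaton ever observed the other's existence.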