Learning in a multi-agent system is difficult because the learning environment, jointly created by all learning agents, is time-varying. This paper studies the model of multi-agent learning in complete-information extensive games (CEGs). We provide two provably convergent algorithms for this model. Both algorithms exploit the special structure of CEGs and guarantee both individual and collective convergence. Our work contributes to the multi-agent learning literature in several respects: 1. We identify a model of multi-agent learning, namely learning in CEGs, and provide two provably convergent algorithms for it. 2. We explicitly address the environment-shifting problem and show how patient agents can collectively learn to play equilibrium strategies. 3. Much game-theoretic work on learning uses a technique called fictitious play, which requires agents to build beliefs about their opponents. For our model of learning in CEGs, we show that agents can collectively converge to the sub-game perfect equilibrium (SPE) by repeatedly reinforcing their previous success/failure experiences; no belief building is necessary.