In recent years, great strides have been made towards creating autonomous agents that learn via interaction with their environment. When considering just an individual agent, it is often appropriate to model the world as stationary, meaning that the same action from the same state will always yield the same (possibly stochastic) effects. In the presence of other independent agents, however, the environment is non-stationary: an action's effects may depend on the actions of the other agents. This non-stationarity poses the primary challenge of multiagent learning and is the main reason it is best studied separately from single-agent learning. The multiagent learning problem is often studied in the stylized settings provided by repeated matrix games. The goal of this article is to introduce a novel multiagent learning algorithm for such a setting, called Convergence with Model Learning and Safety (or CMLeS), that achieves a combination of objectives not attained by any previous algorithm. Specifically, CMLeS is the first multiagent learning algorithm to achieve all three of the following: (1) it converges to following a Nash equilibrium joint-policy in self-play; (2) it achieves close to the best response when interacting with a set of memory-bounded agents whose memory size is upper bounded by a known value; and (3) it ensures an individual return that is very close to its security value when interacting with any other set of agents. Our presentation of CMLeS is backed by a rigorous theoretical analysis, including an analysis of sample complexity wherever applicable.
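To make the setting concrete, the following is a minimal, hypothetical sketch of a repeated matrix game against a memory-bounded opponent. It is not the CMLeS algorithm; the payoff matrix (a Prisoner's Dilemma) and the memory-1 opponent policy (Tit-for-Tat) are illustrative assumptions chosen to show why the best response depends on the opponent's memory of past joint actions.

```python
# Illustrative sketch only, not the CMLeS algorithm. The payoff matrix and
# opponent policy below are hypothetical examples of the repeated-game setting.

# Row player's Prisoner's Dilemma payoffs: actions 0 = Cooperate, 1 = Defect.
PAYOFF = [[3, 0],   # we cooperate: opponent cooperates -> 3, defects -> 0
          [5, 1]]   # we defect:    opponent cooperates -> 5, defects -> 1

def tit_for_tat(history):
    """A memory-bounded (memory size 1) opponent: repeats our last action,
    cooperating on the first round when the history is empty."""
    return history[-1] if history else 0

def average_return(my_action, rounds=100):
    """Average per-round payoff of playing a fixed action against Tit-for-Tat."""
    history, total = [], 0
    for _ in range(rounds):
        opp = tit_for_tat(history)      # opponent reacts to our past play
        total += PAYOFF[my_action][opp]
        history.append(my_action)
    return total / rounds
```

Against this memory-1 opponent, always cooperating averages 3.0 per round, while always defecting collects 5 once and then 1 forever after, so its long-run average approaches 1: the best response against a memory-bounded agent is not the single-shot best response, which is the phenomenon objective (2) addresses.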