Learning in multi-agent settings has recently garnered much interest, resulting in the development of somewhat effective multi-agent learning (MAL) algorithms for repeated normal-form games. However, general-purpose MAL algorithms for richer environments, such as general-sum repeated stochastic (Markov) games (RSGs), remain less advanced. Indeed, previous MAL algorithms for RSGs typically succeed only when the behavior of associates satisfies specific game-theoretic assumptions and when the game belongs to a particular class (such as zero-sum games). In this paper, we present a new algorithm, called Pepper, that can be used to extend MAL algorithms designed for repeated normal-form games to RSGs. We demonstrate that Pepper creates a family of new algorithms, each of whose asymptotic performance in RSGs is reminiscent of its asymptotic performance in related repeated normal-form games. We also show that some algorithms formed with Pepper outperform existing algorithms in an interesting RSG.
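To make the abstract's core idea concrete, the following Python is a minimal sketch of one plausible way such an extension could be structured; it is an illustration under stated assumptions, not the paper's actual Pepper algorithm. It assumes the RSG is decomposed state by state: each state is treated as a repeated normal-form game whose payoffs blend observed immediate rewards with bootstrapped estimates of future value, and a normal-form learner runs independently at each state. All names (StageGameLearner, PepperStyleAgent) and the specific blending rule are hypothetical.

```python
import random
from collections import defaultdict


class StageGameLearner:
    """Placeholder for any repeated normal-form game algorithm.

    Here: greedy play against current payoff estimates with
    epsilon-exploration. A real MAL algorithm (e.g. a no-regret
    learner) would slot in behind the same act(payoffs) interface.
    """

    def __init__(self, n_actions, epsilon=0.1):
        self.n_actions = n_actions
        self.epsilon = epsilon

    def act(self, payoffs):
        # payoffs[a] = current estimated value of playing action a
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: payoffs[a])


class PepperStyleAgent:
    """Hypothetical lift of a stage-game learner to a stochastic game.

    The payoff handed to the per-state learner is an estimate of
    r(s, a) + gamma * V(s'), i.e. observed immediate reward plus a
    bootstrapped value of the successor state, so the stage learner's
    "payoff matrix" reflects the full RSG return. This blending rule
    is an assumption made for illustration only.
    """

    def __init__(self, n_actions, gamma=0.95, alpha=0.1):
        self.n_actions = n_actions
        self.gamma = gamma
        self.alpha = alpha
        self.q = defaultdict(lambda: [0.0] * n_actions)  # per-state payoffs
        self.learners = defaultdict(lambda: StageGameLearner(n_actions))

    def act(self, state):
        # The normal-form learner sees only this state's estimated game.
        return self.learners[state].act(self.q[state])

    def update(self, state, action, reward, next_state):
        # Temporal-difference update of the per-state payoff estimates.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


if __name__ == "__main__":
    # Toy two-state, two-action RSG with random transitions,
    # included only to exercise the sketch end to end.
    agent = PepperStyleAgent(n_actions=2)
    state = 0
    for _ in range(1000):
        a = agent.act(state)
        next_state = random.randrange(2)
        reward = 1.0 if (state == 1 and a == 0) else 0.0
        agent.update(state, a, reward, next_state)
        state = next_state
```

Because the wrapper only requires an act(payoffs) interface, swapping in different repeated normal-form game learners would yield different derived algorithms, which is consistent with the abstract's claim that the construction produces a family of new algorithms.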