Computing equilibria in multiplayer stochastic games of imperfect information

Authors:
Sam Ganzfried;Tuomas Sandholm
Affiliations:
Department of Computer Science, Carnegie Mellon University;Department of Computer Science, Carnegie Mellon University
Venue:
IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Year:
2009

Citing 11
Cited 4

Markov Decision Processes: Discrete Stochastic Dynamic Programming

Markov Decision Processes: Discrete Stochastic Dynamic Programming
Friend-or-Foe Q-learning in General-Sum Games

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Fast Planning in Stochastic Games

UAI '00 Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence
Nash Q-learning for general-sum stochastic games

The Journal of Machine Learning Research
Efficient algorithms for online decision problems

Journal of Computer and System Sciences - Special issue: Learning theory 2003
Settling the Complexity of Two-Player Nash Equilibrium

FOCS '06 Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science
A near-optimal strategy for a heads-up no-limit Texas Hold'em poker tournament

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
A heads-up no-limit Texas Hold'em poker player: discretized betting models and automatically generated equilibrium-finding programs

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 2
Computing an approximate jam/fold equilibrium for 3-player no-limit Texas Hold'em tournaments

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 2
A competitive Texas Hold'em poker player via automated abstraction and real-time equilibrium computation

AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
Learning to Coordinate Efficiently: a model-based approach

Journal of Artificial Intelligence Research

Using counterfactual regret minimization to create competitive multiplayer poker agents

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1
Computing equilibria by incorporating qualitative models

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1
Game theory for cyber security

Proceedings of the Sixth Annual Workshop on Cyber Security and Information Intelligence Research
Computing pure Bayesian-Nash equilibria in games with finite actions and continuous types

Artificial Intelligence

Abstract

Computing a Nash equilibrium in multiplayer stochastic games is a notoriously difficult problem. Prior algorithms have been proven to converge in extremely limited settings and have only been tested on small problems. In contrast, we recently presented an algorithm for computing approximate jam/fold equilibrium strategies in a three-player no-limit Texas hold'em tournament, a very large real-world stochastic game of imperfect information [5]. In this paper we show that it is possible for that algorithm to converge to a non-equilibrium strategy profile. However, we develop an ex post procedure that determines exactly how much each player can gain by deviating from his strategy and confirm that the strategies computed in that paper actually do constitute an ε-equilibrium for a very small ε (0.5% of the tournament entry fee). Next, we develop several new algorithms for computing a Nash equilibrium in multiplayer stochastic games (with perfect or imperfect information) which can provably never converge to a non-equilibrium. Experiments show that one of these algorithms outperforms the original algorithm on the same poker tournament. In short, we present the first algorithms for provably computing an ε-equilibrium of a large stochastic game for small ε. Finally, we present an efficient algorithm that minimizes external regret in both the perfect and imperfect information cases.
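
The notion of an ε-equilibrium used in the abstract can be illustrated on a much simpler object than the poker tournament. The sketch below is not the paper's ex post procedure, which operates on a stochastic game; it assumes a small one-shot (normal-form) game and simply measures the largest amount any single player could gain by deviating unilaterally from a given mixed-strategy profile. The names `expected_payoff`, `epsilon_of_profile`, and `matching_pennies` are hypothetical and chosen for illustration only.

```python
from itertools import product


def expected_payoff(payoff, strategies, player):
    """Expected payoff to `player` under a profile of mixed strategies.

    payoff(actions) returns a tuple with one payoff per player;
    strategies is a list of probability lists, one per player.
    (Both conventions are assumptions made for this sketch.)
    """
    total = 0.0
    for actions in product(*(range(len(s)) for s in strategies)):
        prob = 1.0
        for p, a in enumerate(actions):
            prob *= strategies[p][a]
        if prob > 0.0:
            total += prob * payoff(actions)[player]
    return total


def epsilon_of_profile(payoff, strategies):
    """Largest gain any one player can obtain by deviating unilaterally.

    Checking pure-strategy deviations is enough, because some pure
    strategy is always among the best responses to a fixed profile.
    """
    eps = 0.0
    for player, strat in enumerate(strategies):
        current = expected_payoff(payoff, strategies, player)
        best = current
        for a in range(len(strat)):
            pure = [1.0 if i == a else 0.0 for i in range(len(strat))]
            deviated = strategies[:player] + [pure] + strategies[player + 1:]
            best = max(best, expected_payoff(payoff, deviated, player))
        eps = max(eps, best - current)
    return eps


def matching_pennies(actions):
    """Toy two-player zero-sum game: player 0 wins when the actions match."""
    return (1.0, -1.0) if actions[0] == actions[1] else (-1.0, 1.0)


uniform = [[0.5, 0.5], [0.5, 0.5]]
lopsided = [[1.0, 0.0], [0.5, 0.5]]
eps_uniform = epsilon_of_profile(matching_pennies, uniform)
eps_lopsided = epsilon_of_profile(matching_pennies, lopsided)
```

In this toy game the uniform profile is an exact equilibrium, so `eps_uniform` is 0, while in the lopsided profile the second player can gain 1 by always mismatching, so `eps_lopsided` is 1. A profile is an ε-equilibrium exactly when this quantity is at most ε.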