Adaptive play in Texas Hold'em Poker

Authors:
Raphaël Maîtrepierre;Jérémie Mary;Rémi Munos
Affiliations:
INRIA LILLE NORD EUROPE, France, email: raphael.maitrepierre@inria.fr;INRIA LILLE NORD EUROPE, France, email: jeremie.mary@inria.fr;INRIA LILLE NORD EUROPE, France, email: remi.munos@inria.fr
Venue:
Proceedings of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence
Year:
2008

Abstract

We present a Texas Hold'em poker player for limit heads-up games. Our bot is designed to adapt automatically to the opponent's strategy and is not based on Nash equilibrium computation. The main idea is to design a bot that builds beliefs about its opponent's hand. A forest of game trees is generated according to those beliefs, and the solutions of the trees are combined to make the best decision. The beliefs are updated during the game according to several methods, each of which corresponds to a basic strategy. We then use an exploration-exploitation bandit algorithm, namely UCB (Upper Confidence Bound), to select a strategy to follow. This results in a global play that takes the opponent's strategy into account and turns out to be rather unpredictable. Indeed, if a given strategy is exploited by an opponent, the UCB algorithm will detect it using change point detection and will choose another one. The initial resulting program, called Brennus, participated in the AAAI'07 Computer Poker Competition in both the online and equilibrium competitions and ranked eighth out of seventeen competitors.
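
To illustrate the strategy-selection step described in the abstract, here is a minimal sketch of the standard UCB1 index rule choosing among a few basic strategies. It is not the Brennus implementation. The class name `UCB1`, the helper `simulate`, the per-hand win rates, and the rescaling of rewards to [0, 1] are all assumptions made for illustration, and the change point detection mentioned in the abstract is deliberately left out.

```python
import math
import random


class UCB1:
    """Standard UCB1 bandit over a fixed set of arms (here: basic strategies).

    Hypothetical illustration only; rewards are assumed to lie in [0, 1].
    """

    def __init__(self, n_arms):
        self.counts = [0] * n_arms   # number of hands played with each strategy
        self.sums = [0.0] * n_arms   # total reward collected by each strategy

    def select(self):
        # Play every strategy once before using the confidence bounds.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)

        def index(arm):
            mean = self.sums[arm] / self.counts[arm]
            bonus = math.sqrt(2.0 * math.log(total) / self.counts[arm])
            return mean + bonus

        # Pick the strategy with the highest upper confidence bound.
        return max(range(len(self.counts)), key=index)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward


def simulate(win_rates, n_hands, seed=0):
    """Toy stand-in for real play: strategy i wins a hand with probability
    win_rates[i] (assumed values, not measured results)."""
    rng = random.Random(seed)
    bandit = UCB1(len(win_rates))
    for _ in range(n_hands):
        arm = bandit.select()
        reward = 1.0 if rng.random() < win_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit.counts


if __name__ == "__main__":
    # Three hypothetical strategies; the third is the strongest in this toy setup.
    print(simulate([0.1, 0.3, 0.9], n_hands=2000))
```

In this sketch the bandit keeps sampling the weaker strategies occasionally, so a later drop in the leading strategy's payoff would eventually show up in its empirical mean. Real poker winnings would need to be rescaled to [0, 1] before being fed to `update`.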