Effective short-term opponent exploitation in simplified poker

  • Authors:
  • Finnegan Southey; Bret Hoehn; Robert C. Holte

  • Affiliations:
  • Google, Mountain View, USA 94043; Dept. of Computing Science, University of Alberta, Edmonton, Canada T6G 2E8; Dept. of Computing Science, University of Alberta, Edmonton, Canada T6G 2E8

  • Venue:
  • Machine Learning
  • Year:
  • 2009

Abstract

Uncertainty in poker stems from two key sources: the shuffled deck and an adversary whose strategy is unknown. One approach to playing poker is to find a pessimistic game-theoretic solution (i.e., a Nash equilibrium), but human players have idiosyncratic weaknesses that can be exploited if some model or counter-strategy can be learned by observing their play. However, games against humans last for at most a few hundred hands, so learning must be very fast to be useful. We explore two approaches to opponent modelling in the context of Kuhn poker, a small game for which game-theoretic solutions are known. Parameter estimation and expert algorithms are both studied. Experiments demonstrate that, even in this small game, convergence to maximally exploitive solutions in a small number of hands is impractical, but that good (e.g., better than Nash) performance can be achieved in as few as 50 hands. Finally, we show that amongst a set of strategies with equal game-theoretic value, in particular the set of Nash equilibrium strategies, some are preferable because they speed learning of the opponent's strategy by exploring it more effectively.
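To make the parameter-estimation idea concrete, here is a minimal sketch (not the paper's exact model; the function name, the Beta prior, and the focus on a single "bluff with the Jack" probability are illustrative assumptions). In Kuhn poker a player's strategy reduces to a handful of action probabilities, so a modeller can estimate one such probability from observed hands with a simple Beta-Bernoulli posterior:

```python
import random

def estimate_bluff_rate(observed_bets, prior_a=1.0, prior_b=1.0):
    """Posterior mean of an opponent's bluff probability under a
    Beta(prior_a, prior_b) prior. observed_bets is a list of 0/1
    outcomes: 1 = the opponent bet while holding the Jack."""
    bets = sum(observed_bets)
    a = prior_a + bets
    b = prior_b + len(observed_bets) - bets
    return a / (a + b)

# Simulate 50 hands against an opponent who bluffs with probability 0.3
# (a hypothetical parameter, often written alpha in Kuhn poker analyses).
random.seed(0)
true_alpha = 0.3
hands = [1 if random.random() < true_alpha else 0 for _ in range(50)]
print(round(estimate_bluff_rate(hands), 3))
```

With only tens of observations the posterior mean is still noisy, which illustrates the paper's point: exact convergence to the maximally exploitive response is impractical in short matches, but even a rough estimate can beat a fixed equilibrium strategy.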