How a genetic algorithm learns to play traveler's dilemma by choosing dominated strategies to achieve greater payoffs

  • Authors:
  • Michele Pace

  • Affiliations:
  • INRIA Bordeaux-Sud Ouest, Institute of Mathematics of Bordeaux

  • Venue:
  • CIG'09 Proceedings of the 5th international conference on Computational Intelligence and Games
  • Year:
  • 2009

Abstract

In game theory, the Traveler's Dilemma (abbreviated TD) is a non-zero-sum game in which two players each try to maximize their own payoff without deliberately seeking to harm the opponent. In the classical formulation of the problem, game theory predicts that two purely rational players will always choose the strategy corresponding to the Nash equilibrium of the game. In experiments, however, most human players select much higher values (usually close to $100), deviating strongly from the Nash equilibrium and obtaining, on average, much higher rewards. In this paper we analyze the behaviour of a genetic algorithm that, by repeatedly playing the game, evolves its strategy in order to maximize its payoff. The population has no a priori knowledge of the game, and the fitness function rewards the individuals who obtain high payoffs at the end of each game session. We show that, when a probability measure can be assigned to each strategy, the search for good strategies can be recast as a search problem in a measure space and addressed with, for example, genetic algorithms. Furthermore, encoding the genome as a probability distribution allows us to analyze common crossover and mutation operators in the uncommon case where the genome is a probability measure.
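
To make the setup in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes the classical game parameters (claims between $2 and $100, a $2 bonus/penalty), which the abstract does not state, and it encodes each genome as a probability distribution over claims. The crossover (a convex combination of two parents) and the mutation (perturb one probability mass, then renormalize) are illustrative choices that keep the genome a valid distribution; the operators, the truncation selection, and all names and parameter values here are assumptions.

```python
import random

# Assumed classical parameters; the abstract does not give them.
LOW, HIGH, BONUS = 2, 100, 2
CLAIMS = range(LOW, HIGH + 1)

def payoff(a, b):
    """Payoff to the player who claims a when the opponent claims b."""
    if a == b:
        return a
    if a < b:
        return a + BONUS   # lower claimant receives the bonus
    return b - BONUS       # higher claimant pays the penalty

def normalize(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def random_genome(rng):
    """A genome is a probability distribution over all claims."""
    return normalize([rng.random() for _ in CLAIMS])

def sample_claim(genome, rng):
    """Draw one claim according to the genome's distribution."""
    return rng.choices(CLAIMS, weights=genome, k=1)[0]

def crossover(p, q, rng):
    """Convex combination of two distributions (hypothetical operator)."""
    t = rng.random()
    return [t * x + (1 - t) * y for x, y in zip(p, q)]

def mutate(genome, rng, rate=0.05):
    """Add mass to one random claim, then renormalize (hypothetical operator)."""
    g = list(genome)
    g[rng.randrange(len(g))] += rate * rng.random()
    return normalize(g)

def fitness(genome, population, rng, rounds=20):
    """Average payoff over games against random members of the population."""
    total = 0
    for _ in range(rounds):
        opponent = rng.choice(population)
        total += payoff(sample_claim(genome, rng), sample_claim(opponent, rng))
    return total / rounds

def evolve(generations=30, size=30, seed=0):
    """Truncation selection: keep the top half, refill with mutated children."""
    rng = random.Random(seed)
    pop = [random_genome(rng) for _ in range(size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda g: fitness(g, pop, rng), reverse=True)
        parents = ranked[: size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(size - len(parents))
        ]
        pop = parents + children
    return pop
```

Under these assumed parameters, a claim of 99 against 100 pays 101 to the lower claimant and 97 to the higher one, which is the undercutting incentive that drives the Nash equilibrium toward the minimum claim. Calling `evolve()` returns the final population; inspecting where each distribution concentrates its mass is one way to see which claims the population has come to favour.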