Automated equilibrium analysis of repeated games with private monitoring: a POMDP approach

  • Authors:
  • Yong Joon Joe;Atsushi Iwasaki;Michihiro Kandori;Ichiro Obara;Makoto Yokoo

  • Affiliations:
  • Kyushu University, Japan;Kyushu University, Japan;University of Tokyo, Japan;UCLA, California;Kyushu University, Japan

  • Venue:
  • Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

The present paper investigates repeated games with imperfect private monitoring, where each player privately receives a noisy observation (signal) of the opponent's action. Such games have been paid considerable attention in the AI and economics literature. Identifying pure strategy equilibria in this class has been known as a hard open problem. Recently, we showed that the theory of partially observable Markov decision processes (POMDP) can be applied to identify a class of equilibria where the equilibrium behavior can be described by a finite state automaton (FSA). However, they did not provide a practical method or a program to apply their general idea to actual problems. We first develop a program that acts as a wrapper of a standard POMDP solver, which takes a description of a repeated game with private monitoring and an FSA as inputs, and automatically checks whether the FSA constitutes a symmetric equilibrium. We apply our program to repeated Prisoner's dilemma and find a novel class of FSA, which we call k-period mutual punishment (k-MP). The k-MP starts with cooperation and defects after observing a defection. It restores cooperation after observing defections k-times in a row. Our program enables us to exhaustively search for all FSAs with at most three states, and we found that 2-MP beats all the other pure strategy equilibria with at most three states for some range of parameter values and it is more efficient in an equilibrium than the grim-trigger.