Online calibrated forecasts: Memory efficiency versus universality for learning in games

  • Authors: Shie Mannor; Jeff S. Shamma; Gürdal Arslan

  • Affiliations: Department of Electrical and Computer Engineering, McGill University, Montreal, Canada H3A-2A7; Department of Mechanical and Aerospace Engineering, University of California - Los Angeles, Los Angeles 90095-1597; Department of Electrical Engineering, University of Hawaii at Manoa, Honolulu 96822

  • Venue: Machine Learning
  • Year: 2007

Abstract

We provide a simple learning process that enables an agent to forecast a sequence of outcomes. Our forecasting scheme, termed tracking forecast, is based on tracking the past observations while emphasizing recent outcomes. As opposed to other forecasting schemes, we sacrifice universality in favor of significantly reduced memory requirements. We show that if the sequence of outcomes has an internal (hidden) state that does not change too rapidly, then the tracking forecast is weakly calibrated, meaning that the forecast appears to be correct most of the time. For binary outcomes, this result holds without any internal state assumptions. We consider learning in a repeated strategic game where each player attempts to compute some forecast of the opponent's actions and play a best response to it. We show that if one of the players uses a tracking forecast, while the other player uses a standard learning algorithm (such as exponential regret matching or smooth fictitious play), then the player using the tracking forecast obtains the best response to the actual play of the other player. We further show that if both players use tracking forecasts, then under certain conditions on the game matrix, convergence to a Nash equilibrium is possible with positive probability for a larger class of games than the class of games for which smooth fictitious play converges to a Nash equilibrium.
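The abstract does not state the update rule, so the following is only a minimal sketch of one natural reading of "tracking the past observations while emphasizing recent outcomes": a constant-step-size (exponentially weighted) average of observed outcomes. The function names `tracking_forecast` and `best_response`, the `step` parameter, and the payoff matrix are assumptions made for illustration, not definitions taken from the paper. The sketch stores one number per possible outcome, which is the kind of reduced memory footprint the abstract describes.

```python
from typing import List, Sequence


def tracking_forecast(outcomes: Sequence[int], num_outcomes: int,
                      step: float = 0.1) -> List[List[float]]:
    """Return the forecast held before each outcome, plus the final forecast.

    Assumed update (not confirmed by the abstract):
        f <- (1 - step) * f + step * indicator(observed outcome)
    A constant step discounts old observations geometrically, so recent
    outcomes carry more weight than distant ones.
    """
    forecast = [1.0 / num_outcomes] * num_outcomes  # uniform initial guess
    history = [list(forecast)]
    for y in outcomes:
        forecast = [
            (1.0 - step) * p + step * (1.0 if k == y else 0.0)
            for k, p in enumerate(forecast)
        ]
        history.append(list(forecast))
    return history


def best_response(payoff: Sequence[Sequence[float]],
                  forecast: Sequence[float]) -> int:
    """Pick the own action with the highest expected payoff under the forecast.

    payoff[a][k] is the (hypothetical) payoff for playing action a when the
    opponent plays action k.
    """
    expected = [sum(row[k] * forecast[k] for k in range(len(forecast)))
                for row in payoff]
    return max(range(len(expected)), key=expected.__getitem__)


# Usage: the opponent keeps playing action 1 in a matching-pennies-style game.
history = tracking_forecast([1, 1, 1, 1], num_outcomes=2, step=0.5)
action = best_response([[1.0, -1.0], [-1.0, 1.0]], history[-1])
```

A larger `step` tracks changes in the opponent's behavior faster but makes the forecast noisier; whether a given step size yields the calibration property claimed in the abstract depends on conditions the abstract only summarizes.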