Bias and variance in value function estimation

  • Authors: Shie Mannor; Duncan Simester; Peng Sun; John N. Tsitsiklis
  • Affiliations: Massachusetts Institute of Technology, Cambridge, MA; Massachusetts Institute of Technology, Cambridge, MA; Duke University, Durham, NC; Massachusetts Institute of Technology, Cambridge, MA
  • Venue: ICML '04 Proceedings of the twenty-first international conference on Machine learning
  • Year: 2004

Abstract

We consider the bias and variance in value function estimates that arise from using an empirical model in place of the true model. We analyze this bias and variance for Markov processes both from a classical (frequentist) statistical point of view and in a Bayesian setting. Using a second-order approximation, we provide explicit expressions for the bias and variance in terms of the transition counts and the reward statistics. We present supporting experiments with artificial Markov chains and with a large transactional database provided by a mail-order catalog firm.
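
To make the setting concrete, the following is a minimal sketch of the plug-in estimator the abstract refers to: transition counts are normalised into an empirical transition matrix, and the value function is computed as if that matrix were the true model. It is not the paper's second-order expressions; it only measures bias and variance by brute-force Monte Carlo on a small artificial chain. All function names, the discount factor `gamma`, the uniform fallback for unvisited states, and the assumption that rewards are known exactly are illustrative choices that do not come from the abstract.

```python
import random


def value_function(P, r, gamma, iters=500):
    """Approximate the solution of V = r + gamma * P V by fixed-point iteration.

    Assumes 0 <= gamma < 1 so that the iteration converges.
    """
    n = len(r)
    V = [0.0] * n
    for _ in range(iters):
        V = [r[i] + gamma * sum(P[i][j] * V[j] for j in range(n)) for i in range(n)]
    return V


def empirical_model(counts):
    """Row-normalise transition counts into an estimated transition matrix."""
    P_hat = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # Assumption for this sketch: an unvisited state gets a uniform row.
            P_hat.append([1.0 / len(row)] * len(row))
        else:
            P_hat.append([c / total for c in row])
    return P_hat


def sample_counts(P, n_per_state, rng):
    """Draw n_per_state transitions out of every state and tally them."""
    n = len(P)
    counts = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in rng.choices(range(n), weights=P[i], k=n_per_state):
            counts[i][j] += 1
    return counts


def monte_carlo_bias_variance(P, r, gamma, n_per_state, trials, seed=0):
    """Estimate per-state bias and variance of the plug-in value estimate."""
    rng = random.Random(seed)
    n = len(r)
    V_true = value_function(P, r, gamma)
    samples = []
    for _ in range(trials):
        P_hat = empirical_model(sample_counts(P, n_per_state, rng))
        samples.append(value_function(P_hat, r, gamma))
    means = [sum(s[i] for s in samples) / trials for i in range(n)]
    bias = [means[i] - V_true[i] for i in range(n)]
    var = [sum((s[i] - means[i]) ** 2 for s in samples) / trials for i in range(n)]
    return bias, var


if __name__ == "__main__":
    # Hypothetical two-state chain with a known reward vector.
    P_true = [[0.9, 0.1], [0.2, 0.8]]
    rewards = [1.0, 0.0]
    bias, var = monte_carlo_bias_variance(P_true, rewards, 0.9, n_per_state=20, trials=2000)
    print("bias per state:", bias)
    print("variance per state:", var)
```

Because the value function is a nonlinear function of the transition matrix, an unbiased estimate of the transition probabilities does not in general yield an unbiased value estimate. Rerunning the sketch with different values of `n_per_state` gives a rough empirical view of how the error depends on the transition counts.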