An Empirical Evaluation of Interval Estimation for Markov Decision Processes

  • Authors:
  • Alexander L. Strehl, Michael L. Littman

  • Affiliations:
  • Rutgers University

  • Venue:
  • ICTAI '04: Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence
  • Year:
  • 2004


Abstract

This paper takes an empirical approach to evaluating three model-based reinforcement-learning methods. All methods intend to speed the learning process by mixing exploitation of learned knowledge with exploration of possibly promising alternatives. We consider ε-greedy exploration, which is computationally cheap and popular, but unfocused in its exploration effort; R-Max exploration, a simplification of an exploration scheme that comes with a theoretical guarantee of efficiency; and a well-grounded approach, model-based interval estimation, that better integrates exploration and exploitation. Our experiments indicate that effective exploration can result in dramatic improvements in the observed rate of learning.
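To make the contrast between the exploration strategies concrete, here is a minimal, hypothetical Python sketch (not the authors' implementation) of the two poles the abstract describes: unfocused ε-greedy action selection versus an interval-estimation rule that directs exploration toward actions whose value estimates are still uncertain. The confidence-bound formula below is a simplified stand-in chosen for illustration, not the exact bound used in model-based interval estimation.

```python
import math
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action (unfocused
    exploration); otherwise exploit by picking the highest-valued action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def interval_estimation_action(mean_rewards, counts):
    """Pick the action with the highest upper confidence bound on its mean
    reward -- the 'optimism in the face of uncertainty' idea behind
    interval-estimation methods. Rarely tried actions get wide intervals,
    so exploration is focused on what is genuinely uncertain; untried
    actions get an infinite bound and are explored first."""
    z = 1.96  # normal quantile for a 95% interval (illustrative choice)
    def upper_bound(a):
        n = counts[a]
        if n == 0:
            return float("inf")
        return mean_rewards[a] + z / math.sqrt(n)
    return max(range(len(mean_rewards)), key=upper_bound)
```

With ε = 0 the ε-greedy rule is purely greedy, while the interval-estimation rule still prefers an action it has never tried over a well-understood one, illustrating how the latter integrates exploration and exploitation in a single criterion.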