Active learning of dynamic Bayesian networks in Markov decision processes

  • Authors:
  • Anders Jonsson; Andrew Barto

  • Affiliations:
  • Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain; Autonomous Learning Laboratory, Department of Computer Science, University of Massachusetts, Amherst, MA

  • Venue:
  • SARA'07: Proceedings of the 7th International Conference on Abstraction, Reformulation, and Approximation
  • Year:
  • 2007


Abstract

Several recent techniques for solving Markov decision processes use dynamic Bayesian networks to compactly represent tasks. The dynamic Bayesian network representation may not be given, in which case it is necessary to learn it if one wants to apply these techniques. We develop an algorithm for learning dynamic Bayesian network representations of Markov decision processes using data collected through exploration in the environment. To accelerate data collection we develop a novel scheme for active learning of the networks. We assume that it is not possible to sample the process in arbitrary states, only along trajectories, which prevents us from applying existing active learning techniques. Our active learning scheme selects actions that maximize the total entropy of distributions used to evaluate potential refinements of the networks.
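The active learning scheme described above selects the action whose associated distributions carry the most uncertainty. As a rough illustration (not the paper's actual algorithm), the following sketch scores each action by the total Shannon entropy of the empirical outcome distributions attached to its candidate network refinements, and picks the highest-scoring action; all names and the data layout here are hypothetical:

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (in bits) of an empirical distribution given as counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log2(p)
    return h

def select_action(refinement_counts_by_action):
    """Choose the action whose candidate-refinement distributions have the
    largest total entropy, i.e. the action about which the data collected so
    far is least informative."""
    def total_entropy(action):
        return sum(entropy(c) for c in refinement_counts_by_action[action])
    return max(refinement_counts_by_action, key=total_entropy)

# Toy example: two actions, each with outcome counts for two candidate
# refinements of the dynamic Bayesian network (hypothetical data).
counts = {
    "left":  [Counter({"s1": 9, "s2": 1}), Counter({"s1": 10})],
    "right": [Counter({"s1": 5, "s2": 5}), Counter({"s1": 4, "s2": 6})],
}
print(select_action(counts))  # "right": its outcome counts are most uncertain
```

Because sampling is only possible along trajectories, such a scheme would be applied at each step of an ongoing trajectory rather than by jumping to arbitrary states.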