EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Several recent techniques for solving Markov decision processes use dynamic Bayesian networks to represent tasks compactly. When the dynamic Bayesian network representation is not given, it must be learned before these techniques can be applied. We develop an algorithm that learns dynamic Bayesian network representations of Markov decision processes from data collected through exploration in the environment. To accelerate data collection, we develop a novel scheme for active learning of the networks. Because we assume the process cannot be sampled in arbitrary states, only along trajectories, existing active learning techniques do not apply. Our active learning scheme instead selects actions that maximize the total entropy of the distributions used to evaluate potential refinements of the networks.
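The entropy-maximizing action selection described above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the data layout (`refinement_counts`, a hypothetical mapping from each action to the count vectors of the distributions evaluating its candidate network refinements) and the function names are assumptions introduced here for clarity.

```python
import math

def entropy(counts):
    """Shannon entropy (in nats) of the empirical distribution given by counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h

def select_action(actions, refinement_counts):
    """Pick the action whose candidate-refinement distributions have the
    highest total entropy, i.e. the action about which the learner is
    currently most uncertain.

    refinement_counts[a] is a list of count vectors, one per distribution
    used to evaluate a potential refinement of the network for action a
    (an illustrative bookkeeping choice, not the paper's exact one).
    """
    return max(actions,
               key=lambda a: sum(entropy(c) for c in refinement_counts[a]))
```

For example, an action whose refinement distributions are still near-uniform (high entropy, little evidence either way) is preferred over one whose distributions are already sharply peaked:

```python
counts = {"left": [[1, 1], [2, 2]], "right": [[4, 0], [0, 4]]}
select_action(["left", "right"], counts)  # → "left"
```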