A probabilistic training scheme for the time-concentration network
KBCS '89 Proceedings of the international conference on Knowledge based computer systems
Relational reinforcement learning
Machine Learning - Special issue on inductive logic programming
Foundations of Inductive Logic Programming
Near-Optimal Reinforcement Learning in Polynomial Time
Machine Learning
Algorithm-Directed Exploration for Model-Based Reinforcement Learning in Factored MDPs
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
ILP '96 Selected Papers from the 6th International Workshop on Inductive Logic Programming
R-max - a general polynomial time algorithm for near-optimal reinforcement learning
The Journal of Machine Learning Research
Integrating Guidance into Relational Reinforcement Learning
Machine Learning
An analytic solution to discrete Bayesian reinforcement learning
ICML '06 Proceedings of the 23rd international conference on Machine learning
Relational Dependency Networks
The Journal of Machine Learning Research
Introduction to Statistical Relational Learning (Adaptive Computation and Machine Learning)
An object-oriented representation for efficient reinforcement learning
Proceedings of the 25th international conference on Machine learning
Non-parametric policy gradients: a unified treatment of propositional and relational domains
Proceedings of the 25th international conference on Machine learning
Transfer Learning in Reinforcement Learning Problems Through Partial Policy Recycling
ECML '07 Proceedings of the 18th European conference on Machine Learning
Practical solution techniques for first-order MDPs
Artificial Intelligence
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Near-Bayesian exploration in polynomial time
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Relevance Grounding for Planning in Relational Domains
ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Efficient learning of action schemas and web-service descriptions
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
FLUCAP: a heuristic search planner for first-order MDPs
Journal of Artificial Intelligence Research
Learning symbolic models of stochastic domains
Journal of Artificial Intelligence Research
First order decision diagrams for relational MDPs
Journal of Artificial Intelligence Research
Active learning with statistical models
Journal of Artificial Intelligence Research
Reinforcement learning: a survey
Journal of Artificial Intelligence Research
Efficient reinforcement learning in factored MDPs
IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2
Online learning and exploiting relational models in reinforcement learning
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Generalizing plans to new environments in relational MDPs
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Symbolic dynamic programming for first-order MDPs
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
Top-down induction of first-order logical decision trees
Artificial Intelligence
Reinforcement Learning in Finite MDPs: PAC Analysis
The Journal of Machine Learning Research
Learning models of relational MDPs using graph kernels
MICAI'07 Proceedings of the 6th Mexican international conference on Artificial Intelligence: Advances in Artificial Intelligence
Probabilistic inductive logic programming
Exploring compact reinforcement-learning representations with linear regression
UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
A unifying framework for computational reinforcement learning theory
Fast active exploration for link-based preference learning using Gaussian processes
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part III
Planning with noisy probabilistic relational rules
Journal of Artificial Intelligence Research
Knows what it knows: a framework for self-aware learning
Machine Learning
Efficient learning of relational models for sequential decision making
An object-oriented representation for efficient reinforcement learning
A fundamental problem in reinforcement learning is balancing exploration and exploitation. We address this problem in the context of model-based reinforcement learning in large stochastic relational domains by developing relational extensions of the concepts underlying the E3 and R-MAX algorithms. Efficient exploration in exponentially large state spaces must exploit the generalization of the learned model: what in a propositional setting would count as a novel situation worth exploring may, in the relational setting, be a well-known context in which exploitation is promising. To address this, we introduce relational count functions, which generalize the classical notion of state and action visitation counts. We provide guarantees on the exploration efficiency of our framework, assuming access to a relational KWIK learner and a near-optimal planner. We then propose a concrete exploration algorithm that integrates a practically efficient probabilistic rule learner with a relational planner (for which no such guarantees exist) and uses the contexts of learned relational rules as features to model the novelty of states and actions. Our results in noisy 3D simulated robot manipulation problems and in domains of the International Planning Competition demonstrate that our approach is more effective than existing propositional and factored exploration techniques.
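To make the idea of relational count functions concrete, the following is a minimal sketch (our own illustration, not the paper's implementation) of how experiences can be counted per learned rule context rather than per ground state, and plugged into an R-MAX-style known/unknown test. The names Rule, RelationalCounter, and known_threshold are hypothetical, and contexts are simplified to sets of ground literals; a full relational treatment would match contexts by first-order unification.

```python
# Hypothetical sketch of a relational count function (not the paper's code).
# Experiences are counted per learned rule context instead of per ground state,
# so the novelty estimate generalizes across states that share a context.
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A learned relational rule: an action plus the logical context it requires."""
    action: str            # e.g. "grab(a)"
    context: frozenset     # e.g. frozenset({"clear(a)", "on(a, b)"}); ground literals
                           # for simplicity, instead of variables + unification


class RelationalCounter:
    """R-MAX-style novelty test driven by rule contexts rather than state visits."""

    def __init__(self, known_threshold: int = 5):
        self.known_threshold = known_threshold
        self.context_counts = defaultdict(int)   # (action, context) -> experiences

    def covering_rule(self, rules, state, action):
        """Return a rule for `action` whose context holds in `state`, if any."""
        for rule in rules:
            if rule.action == action and rule.context <= state:
                return rule
        return None

    def count(self, rules, state, action):
        """Relational count: experiences of the covering context; 0 if uncovered (novel)."""
        rule = self.covering_rule(rules, state, action)
        return 0 if rule is None else self.context_counts[(rule.action, rule.context)]

    def update(self, rules, state, action):
        """Record one experience of (state, action) under its covering context."""
        rule = self.covering_rule(rules, state, action)
        if rule is not None:
            self.context_counts[(rule.action, rule.context)] += 1

    def is_known(self, rules, state, action):
        """Exploit when the generalized count reaches the threshold; explore otherwise."""
        return self.count(rules, state, action) >= self.known_threshold
```

Under such a counter, a ground state that has never been visited can still be treated as known if the context of a learned rule covers it with sufficient past experience, which is the kind of generalized novelty estimate the abstract describes.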