To quickly achieve good performance, reinforcement-learning algorithms acting in large continuous-valued domains must use a representation that is powerful enough to capture important domain characteristics, yet still allows generalization, or sharing, among experiences. Our algorithm balances this tradeoff by using a stochastic, switching, parametric dynamics representation. We argue that this model characterizes a number of significant real-world domains, such as robot navigation across varying terrain. We prove that this representational assumption allows our algorithm to be probably approximately correct, with a sample complexity that scales polynomially in all problem-specific quantities, including the state-space dimension. We also explicitly incorporate the error introduced by approximate planning into our sample complexity bounds, in contrast to prior Probably Approximately Correct (PAC) Markov Decision Process (MDP) approaches, which typically assume the estimated MDP can be solved exactly. Our experimental results on constructing plans for driving to work using real car trajectory data, as well as a small robot experiment on navigating varying terrain, demonstrate that our dynamics representation captures real-world dynamics well enough to produce good performance.
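To make the representational assumption concrete, the following is a minimal sketch (not the paper's implementation) of a stochastic, switching, parametric dynamics model for a hypothetical 2-D navigation domain. All names, dimensions, and parameter values here are illustrative assumptions: each terrain mode has its own linear-Gaussian dynamics, and the mode is switched by the agent's position.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D navigation domain with two terrain modes
# (e.g. pavement vs. grass). Each mode m has its own linear-Gaussian
# dynamics:  s' = A[m] @ s + B[m] @ a + noise
A = {0: np.eye(2), 1: np.eye(2)}
B = {0: 1.0 * np.eye(2),   # pavement: actions move the robot at full speed
     1: 0.4 * np.eye(2)}   # grass: same parametric form, slower response
NOISE_STD = 0.05           # illustrative process-noise scale

def mode(state):
    """Switching function: terrain mode is determined by position."""
    return 0 if state[0] < 5.0 else 1

def step(state, action):
    """Sample the next state under the current mode's dynamics."""
    m = mode(state)
    return A[m] @ state + B[m] @ action + rng.normal(0.0, NOISE_STD, size=2)

# Roll out a fixed "drive east" policy across the terrain boundary.
s = np.array([0.0, 0.0])
for _ in range(20):
    s = step(s, np.array([1.0, 0.0]))
```

The point of the structure is sample efficiency: a learner only needs to estimate one small parameter set (A, B) per mode, so experience gathered anywhere within a terrain region generalizes to the whole region, which is what makes polynomial sample-complexity bounds plausible.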