Exploring compact reinforcement-learning representations with linear regression

Authors:
Thomas J. Walsh;István Szita;Carlos Diuk;Michael L. Littman
Affiliations:
Rutgers University, Piscataway, NJ;University of Alberta, Edmonton, AB, Canada;Rutgers University, Piscataway, NJ;Rutgers University, Piscataway, NJ
Venue:
UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Year:
2009

Abstract

This paper presents a new algorithm for online linear regression whose efficiency guarantees satisfy the requirements of the KWIK (Knows What It Knows) framework. The algorithm improves on the complexity bounds of the current state-of-the-art procedure in this setting. We explore several applications of this algorithm to learning compact reinforcement-learning representations. We show that KWIK linear regression can be used to learn the reward function of a factored MDP and the probabilities of action outcomes in Stochastic STRIPS and Object-Oriented MDPs, none of which had previously been proven efficiently learnable in the RL setting. We also combine KWIK linear regression with other KWIK learners to learn larger portions of these models, and we report experiments on learning factored MDP transition and reward functions together.
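
To make the "predict or admit ignorance" idea behind KWIK linear regression concrete, here is a minimal sketch. It is not the algorithm from the paper, and it carries none of the paper's guarantees. It assumes a ridge-regularized least-squares estimate and uses the quantity x^T A^{-1} x as a stand-in uncertainty test. The class name `KwikStyleLinearRegressor` and the parameters `threshold` and `ridge` are hypothetical names introduced only for this illustration.

```python
from typing import List, Optional


class KwikStyleLinearRegressor:
    """Illustrative predict-or-abstain online linear regressor.

    Assumption: this is a generic uncertainty-thresholded ridge regressor,
    NOT the algorithm analyzed in the paper.
    """

    def __init__(self, dim: int, threshold: float = 0.1, ridge: float = 1.0) -> None:
        self.dim = dim
        self.threshold = threshold  # hypothetical cutoff for abstaining
        # Inverse of the regularized Gram matrix A = ridge * I + sum(x x^T).
        self.a_inv: List[List[float]] = [
            [(1.0 / ridge if i == j else 0.0) for j in range(dim)]
            for i in range(dim)
        ]
        self.b: List[float] = [0.0] * dim  # running sum of y * x

    def _a_inv_times(self, v: List[float]) -> List[float]:
        return [sum(row[j] * v[j] for j in range(self.dim)) for row in self.a_inv]

    def uncertainty(self, x: List[float]) -> float:
        """Return x^T A^{-1} x, which is large for poorly explored directions."""
        u = self._a_inv_times(x)
        return sum(x[i] * u[i] for i in range(self.dim))

    def predict(self, x: List[float]) -> Optional[float]:
        """Return a prediction, or None to signal "I don't know"."""
        if self.uncertainty(x) > self.threshold:
            return None
        weights = self._a_inv_times(self.b)  # ridge least-squares weights
        return sum(weights[i] * x[i] for i in range(self.dim))

    def update(self, x: List[float], y: float) -> None:
        """Incorporate one labeled example via a Sherman-Morrison rank-one update."""
        u = self._a_inv_times(x)  # A^{-1} x (A^{-1} is symmetric)
        denom = 1.0 + sum(x[i] * u[i] for i in range(self.dim))
        for i in range(self.dim):
            for j in range(self.dim):
                self.a_inv[i][j] -= u[i] * u[j] / denom
        for i in range(self.dim):
            self.b[i] += y * x[i]
```

In a KWIK-style protocol the learner would see the label only after it abstains, so a caller would typically invoke `update` when `predict` returns `None`. The sketch leaves that policy to the caller, and the threshold would have to be set from the desired accuracy rather than fixed at an arbitrary default.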