This paper presents a new algorithm for online linear regression whose efficiency guarantees satisfy the requirements of the KWIK (Knows What It Knows) framework. The algorithm improves on the complexity bounds of the current state-of-the-art procedure in this setting. We explore several applications of this algorithm to learning compact reinforcement-learning (RL) representations. We show that KWIK linear regression can be used to learn the reward function of a factored MDP and the probabilities of action outcomes in Stochastic STRIPS and Object-Oriented MDPs, none of which had previously been proven efficiently learnable in the RL setting. We also combine KWIK linear regression with other KWIK learners to learn larger portions of these models, and we report experiments on learning factored MDP transition and reward functions together.
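To make the KWIK protocol concrete, the following is a minimal sketch of a linear-regression learner that either returns a prediction or admits "I don't know" (here `None`), and that receives a label only in the latter case. It is not the algorithm presented in the paper and carries none of its guarantees: the class name `KwikLinearRegressionSketch`, the ridge-regularized least-squares estimate, the uncertainty test `x^T A^{-1} x > threshold`, and the values in `demo` are all assumptions chosen for illustration.

```python
import numpy as np


class KwikLinearRegressionSketch:
    """Illustrative KWIK-style learner (hypothetical, not the paper's method)."""

    def __init__(self, dim, threshold=0.03, ridge=1.0):
        self.threshold = threshold          # assumed uncertainty cutoff
        self.A = ridge * np.eye(dim)        # regularized Gram matrix of seen inputs
        self.b = np.zeros(dim)              # accumulated sum of label * input

    def predict(self, x):
        """Return an estimate of theta . x, or None for "I don't know"."""
        x = np.asarray(x, dtype=float)
        # Heuristic uncertainty: large when x lies in a poorly explored direction.
        uncertainty = float(x @ np.linalg.solve(self.A, x))
        if uncertainty > self.threshold:
            return None
        theta = np.linalg.solve(self.A, self.b)  # ridge least-squares estimate
        return float(theta @ x)

    def update(self, x, y):
        """Incorporate a labeled example, observed only after a None prediction."""
        x = np.asarray(x, dtype=float)
        self.A += np.outer(x, x)
        self.b += y * x


def demo():
    """Run the KWIK loop on noise-free labels from an assumed weight vector."""
    true_theta = np.array([0.5, -0.2])      # hypothetical target weights
    learner = KwikLinearRegressionSketch(dim=2)
    inputs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
    unknown = 0
    for step in range(200):
        x = inputs[step % 2]
        if learner.predict(x) is None:      # learner admits uncertainty
            unknown += 1
            learner.update(x, float(true_theta @ x))
    return unknown, learner.predict(inputs[0])


if __name__ == "__main__":
    print(demo())
```

In this sketch the number of "I don't know" answers stays well below the number of queries, because each labeled example shrinks the uncertainty along the queried direction. A KWIK bound formalizes the same idea by requiring that count to be polynomially bounded while every committed prediction is accurate.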