Using inaccurate models in reinforcement learning
ICML '06 Proceedings of the 23rd international conference on Machine learning
In the model-based policy search approach to reinforcement learning (RL), policies are found using a model (or "simulator") of the Markov decision process. However, for high-dimensional continuous-state tasks, it can be extremely difficult to build an accurate model, so the algorithm often returns a policy that works in simulation but not in real life. At the other extreme, model-free RL tends to require infeasibly large numbers of real-life trials. In this paper, we present a hybrid algorithm that requires only an approximate model and a small number of real-life trials. The key idea is to successively "ground" the policy evaluations using real-life trials, while relying on the approximate model to suggest local changes. Our theoretical results show that this algorithm achieves near-optimal performance in the real system, even when the model is only approximate. Empirical results also demonstrate that, given only a crude model and a small number of real-life trials, our algorithm can obtain near-optimal performance in the real system.
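The grounding idea above can be illustrated with a minimal sketch. This is not the authors' exact algorithm, only an assumed toy instantiation: a 1-D linear system with quadratic cost, a deliberately wrong model, and a one-parameter policy u = -k*x. Each iteration runs the current policy once on the "real" system, adds time-indexed bias terms so the model reproduces the observed real trajectory (so the grounded model agrees with reality at the current policy), and then uses the grounded model only to rank small local policy changes. All names and constants here are illustrative, not from the paper.

```python
# Hedged sketch of hybrid model-based/model-free policy search:
# the approximate model is "grounded" on real rollouts, then used
# only to suggest local policy changes. Toy 1-D linear setting.

A_REAL, B_REAL = 0.9, 1.0     # true dynamics x' = a*x + b*u (unknown to the learner)
A_MODEL, B_MODEL = 0.7, 0.8   # crude approximate model
HORIZON, X0 = 20, 1.0         # episode length and start state

def rollout(a, b, k, biases=None):
    """Simulate x' = a*x + b*u (+ optional per-step bias) under u = -k*x.
    Returns total quadratic cost (sum of x^2 + u^2) and the state trajectory."""
    x, cost, xs = X0, 0.0, []
    for t in range(HORIZON):
        u = -k * x
        cost += x * x + u * u
        xs.append(x)
        x = a * x + b * u + (biases[t] if biases is not None else 0.0)
    return cost, xs

def real_cost(k):
    """One real-life trial: cost of policy k on the true system."""
    return rollout(A_REAL, B_REAL, k)[0]

def grounded_model_cost(k, biases):
    """Cost of policy k in the bias-corrected (grounded) model."""
    return rollout(A_MODEL, B_MODEL, k, biases)[0]

def fit_biases(k):
    """Run policy k on the real system once; add time-indexed biases so the
    approximate model exactly reproduces the observed real trajectory."""
    _, xs_real = rollout(A_REAL, B_REAL, k)
    biases = []
    for t in range(HORIZON - 1):
        u = -k * xs_real[t]
        pred = A_MODEL * xs_real[t] + B_MODEL * u
        biases.append(xs_real[t + 1] - pred)
    biases.append(0.0)  # no correction needed after the final step
    return biases

def hybrid_search(k=0.0, step=0.05, iters=40):
    """Each iteration: one real rollout grounds the model; the grounded
    model then ranks small local policy changes (coordinate search)."""
    for _ in range(iters):
        biases = fit_biases(k)
        k = min([k - step, k, k + step],
                key=lambda kk: grounded_model_cost(kk, biases))
    return k
```

By construction, the grounded model's cost at the current policy equals the real cost, so only the *local comparisons* lean on the inaccurate model; this is why a crude model plus a handful of real trials can suffice. Here each iteration uses exactly one real rollout.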