This work considers the problems of learning and planning in Markovian environments with constant observation and reward delays. We provide a hardness result for the general planning problem and positive results for several special cases with deterministic or otherwise constrained dynamics. We present an algorithm, Model Based Simulation, for planning in such environments and use model-based reinforcement learning to extend this approach to the learning setting in both finite and continuous environments. Empirical comparisons show this algorithm holds significant advantages over others for decision making in delayed-observation environments.
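The abstract does not spell out how Model Based Simulation works. As a rough illustration only, the sketch below assumes one plausible reading of planning under a constant observation delay: the agent takes the most recent (delayed) state it has seen, rolls a known deterministic model forward through the actions it has issued since then, and applies an ordinary undelayed policy to the resulting estimate. The names `simulate_forward`, `act_with_delay`, `MODEL`, and `POLICY`, and the toy chain environment, are all hypothetical and not taken from the paper.

```python
from collections import deque

def simulate_forward(model, delayed_state, pending_actions):
    """Roll a deterministic model forward through the actions whose
    effects have not been observed yet (assumed planning step)."""
    state = delayed_state
    for action in pending_actions:
        state = model[(state, action)]
    return state

def act_with_delay(model, policy, delayed_state, pending_actions):
    """Pick an action by applying an undelayed policy to the
    simulated estimate of the current state."""
    estimate = simulate_forward(model, delayed_state, pending_actions)
    return policy[estimate]

# Hypothetical toy environment: a 5-state chain with actions -1 / +1,
# where moves are clipped at both ends of the chain.
N_STATES = 5
MODEL = {
    (s, a): min(max(s + a, 0), N_STATES - 1)
    for s in range(N_STATES)
    for a in (-1, 1)
}

# Hypothetical undelayed policy: move right below state 3, else move left.
POLICY = {s: (1 if s < 3 else -1) for s in range(N_STATES)}

# Observation delay of 2 steps: the agent last saw state 1 and has
# issued two +1 actions since then.
pending = deque([1, 1])
chosen = act_with_delay(MODEL, POLICY, 1, pending)
naive = POLICY[1]  # acting directly on the stale observation
print(chosen, naive)
```

In this toy case the simulated estimate is state 3, so the agent moves left, whereas acting directly on the stale observation would move right. With stochastic dynamics the forward roll would need some substitute for the exact lookup (for example, following most-likely transitions), which is a further assumption beyond what the abstract states.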