Time is a crucial variable in planning and often requires special attention, since it introduces a specific structure along with additional complexity, especially in the case of decision-making under uncertainty. In this paper, after reviewing and comparing MDP frameworks designed to deal with temporal problems, we focus on Generalized Semi-Markov Decision Processes (GSMDP) with observable time. We highlight the inherent structure and complexity of these problems and contrast them with classical reinforcement learning problems. Finally, we introduce a new simulation-based reinforcement learning method for solving GSMDP, bringing together results from simulation-based policy iteration, regression techniques, and simulation theory. We illustrate our approach on a subway network control example.
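To make the combination of simulation-based policy iteration and regression concrete, here is a minimal, entirely hypothetical sketch (not the authors' implementation): a toy chain problem with random sojourn times and time-dependent discounting, where each policy is evaluated by Monte Carlo rollouts and the regression step is collapsed to per-state-action averaging because the state set is tiny. All names and parameters (`step`, `rollout`, `GAMMA`, the reward values) are illustrative assumptions.

```python
import random

# Toy chain standing in for a GSMDP with observable time (hypothetical):
# states 0..N-1, action 0 = wait (small reward), action 1 = advance
# (immediate cost, large reward near the end of the chain). Transition
# durations are random, mimicking GSMP clock samples.
N = 6
GAMMA = 0.95

def step(s, a, rng):
    """One simulated transition: returns (next_state, reward, duration)."""
    dt = rng.uniform(0.5, 1.5)           # random sojourn time (GSMP-like clock)
    if a == 1:
        ns = min(s + 1, N - 1)
        r = 10.0 if ns == N - 1 else -1.0
    else:
        ns, r = s, 0.1
    return ns, r, dt

def rollout(s, a, policy, rng, horizon=30):
    """Monte Carlo return of taking action a in state s, then following policy."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        s, r, dt = step(s, a, rng)
        total += discount * r
        discount *= GAMMA ** dt          # continuous-time (duration-aware) discounting
        a = policy(s)
    return total

def evaluate_q(policy, rng, n_rollouts=30):
    """Policy evaluation by simulation; the 'regression' here degenerates
    to tabular averaging since the state space is finite and small."""
    return {(s, a): sum(rollout(s, a, policy, rng)
                        for _ in range(n_rollouts)) / n_rollouts
            for s in range(N) for a in (0, 1)}

def policy_iteration(n_iters=3, seed=0):
    rng = random.Random(seed)
    policy = lambda s: 0                 # start from the do-nothing policy
    for _ in range(n_iters):
        q = evaluate_q(policy, rng)
        greedy = {s: max((0, 1), key=lambda a: q[(s, a)]) for s in range(N)}
        policy = lambda s, g=greedy: g[s]
    return policy

pol = policy_iteration()
print([pol(s) for s in range(N)])        # learned action per state
```

In a real GSMDP with continuous state and observed time, the averaging step above would be replaced by an actual regressor fitted on sampled (state, action, return) triples, which is the role regression techniques play in the abstract's method.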