Theoretical Computer Science - Tools and algorithms for the construction and analysis of systems (TACAS 2004)
Continuous-time Markov decision processes (CTMDPs) with a finite state and action space have been studied for a long time. It is known that, under fairly general conditions, the reward gained over a finite horizon can be maximized by a so-called piecewise constant policy, which changes only finitely often over a finite interval. Although this result has been available for more than 30 years, numerical approaches to computing the optimal policy and reward have been restricted to discretization methods, which are known to converge to the true solution only as the discretization step goes to zero. In this paper, we present a new method that is based on uniformization of the CTMDP and allows one to compute an ε-optimal policy, up to a predefined precision, in a numerically stable way using adaptive time steps.
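To make the uniformization idea concrete, the following is a minimal illustrative sketch (not the adaptive-step algorithm of the paper): the CTMDP generators Q_a are uniformized with rate Λ ≥ max exit rate into stochastic kernels P_a = I + Q_a/Λ, and the finite-horizon reward is approximated by backward value iteration with a fixed time step. All names, the crude step count, and the fixed step size are assumptions made for illustration; the paper's contribution is precisely to replace the fixed step by adaptively chosen steps with an ε-guarantee.

```python
import numpy as np

def uniformized_value_iteration(Q, r, T, n_steps=None):
    """Approximate the maximal expected accumulated reward over horizon T
    for a finite CTMDP, via value iteration on the uniformized process.

    Q : (A, S, S) array of generator matrices, one per action
    r : (A, S) array of state reward rates, one row per action
    Illustrative fixed-step sketch; hypothetical helper, not the paper's method.
    """
    A, S, _ = Q.shape
    # Uniformization rate: at least the largest exit rate over all actions.
    lam = max(-Q[a, s, s] for a in range(A) for s in range(S))
    # Uniformized DTMC kernels P_a = I + Q_a / lam (row-stochastic).
    P = np.eye(S)[None, :, :] + Q / lam
    if n_steps is None:
        # Crude fixed step count ensuring lam * dt <= 0.5 (no adaptivity here).
        n_steps = 2 * int(np.ceil(lam * T)) + 10
    dt = T / n_steps
    V = np.zeros(S)  # value at the end of the horizon
    for _ in range(n_steps):
        # Bellman backup over one step: earn r_a*dt, then either stay
        # (prob. 1 - lam*dt) or jump according to P_a (prob. lam*dt).
        cand = r * dt + (1.0 - lam * dt) * V[None, :] + lam * dt * (P @ V)
        V = cand.max(axis=0)  # greedy maximization defines the policy per step
    return V
```

The backup implements V ← max_a [r_a dt + (I + dt Q_a) V], i.e. an explicit discretization of the optimality equations expressed through the uniformized kernels; the step-wise argmax recovers a piecewise constant policy.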