Towards analysis of semi-Markov decision processes

Authors:
Taolue Chen;Jian Lu
Affiliations:
FMT, University of Twente, The Netherlands;State Key Laboratory, Novel Software Technology, Nanjing University, China
Venue:
AICI'10 Proceedings of the 2010 international conference on Artificial intelligence and computational intelligence: Part I
Year:
2010

Abstract

We investigate semi-Markov decision processes (SMDPs). Two problems are studied: the time-bounded reachability problem and the long-run average fraction of time problem. The former asks for the maximal (or minimal) probability of reaching a given set of states within a given time bound. We derive a Bellman equation that characterizes the maximal time-bounded reachability probability and suggest two approaches to solving it, based on discretization and on randomized techniques, respectively. The latter asks for the maximal (or minimal) long-run average fraction of time spent in a given set of states. We exploit a graph-theoretic decomposition of the given SMDP based on maximal end components and reduce the problem to linear programming problems.
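
To make the discretization idea for time-bounded reachability concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not state the Bellman equation, so the sketch assumes a common form in which, for a non-goal state s and time budget t, the value is the maximum over actions of the integral over the sojourn time τ in [0, t] of the expected value at the successor with remaining budget t − τ. It further assumes that the sojourn-time distribution depends only on the state and action. All names (`time_bounded_reachability`, `P`, `cdf`, `steps`) are hypothetical. The integral is replaced by a sum over a uniform grid, with each transition counted at the end of its grid interval, so the result tends to err on the low side.

```python
import math

def time_bounded_reachability(states, actions, P, cdf, goal, T, steps):
    """Grid approximation of the maximal probability to reach `goal` within time T.

    states  : iterable of state names
    actions : dict state -> list of actions enabled in that state
    P       : dict (state, action) -> dict successor -> probability
    cdf     : function (state, action, t) -> P(sojourn time <= t)  [assumed form]
    goal    : set of goal states
    steps   : number of grid intervals (assumed parameter of this sketch)
    """
    h = T / steps
    # V[k][s] approximates the value for state s with time budget k*h.
    V = [{s: (1.0 if s in goal else 0.0) for s in states}
         for _ in range(steps + 1)]
    for k in range(1, steps + 1):
        for s in states:
            if s in goal:
                continue  # goal states keep value 1
            best = 0.0
            for a in actions[s]:
                total = 0.0
                for j in range(1, k + 1):
                    # Probability that the sojourn ends in ((j-1)h, jh].
                    w = cdf(s, a, j * h) - cdf(s, a, (j - 1) * h)
                    # The successor then has budget (k-j)h left.
                    total += w * sum(p * V[k - j][t]
                                     for t, p in P[(s, a)].items())
                best = max(best, total)
            V[k][s] = best
    return V[steps]

# Toy model (hypothetical): from "s", action "slow" or "fast" leads to goal "g",
# with exponential sojourn times of rate 1 and 2 respectively.
rates = {("s", "slow"): 1.0, ("s", "fast"): 2.0}
toy_result = time_bounded_reachability(
    states=["s", "g"],
    actions={"s": ["slow", "fast"], "g": []},
    P={("s", "slow"): {"g": 1.0}, ("s", "fast"): {"g": 1.0}},
    cdf=lambda s, a, t: 1.0 - math.exp(-rates[(s, a)] * t),
    goal={"g"},
    T=1.0,
    steps=100,
)
print(toy_result["s"])
```

In this toy model the goal is reached in a single step, so the grid sum telescopes and the value for `"s"` coincides (up to rounding) with the faster action's distribution function at T, that is, 1 − exp(−2T). In models with longer paths the grid introduces a genuine approximation error that shrinks as `steps` grows.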