We show that one can approximate the least fixed point solution of a multivariate system of monotone probabilistic max (min) polynomial equations in time polynomial in both the encoding size of the system of equations and in log(1/ε), where ε > 0 is the desired additive error bound of the solution. (The model of computation is the standard Turing machine model.) These equations form the Bellman optimality equations for several important classes of infinite-state Markov Decision Processes (MDPs). Thus, as a corollary, we obtain the first polynomial-time algorithms for computing, to within arbitrary desired precision, the optimal value vector for several classes of infinite-state MDPs that arise as extensions of classic, heavily studied, purely stochastic processes. These include both the problem of maximizing and the problem of minimizing the termination (extinction) probability of multi-type branching MDPs, stochastic context-free MDPs, and 1-exit Recursive MDPs. We also show that an ε-optimal policy can be computed in polynomial time for any desired ε > 0.
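To illustrate the shape of these Bellman equations, the sketch below runs naive value iteration (Kleene iteration from the zero vector) on a hypothetical one-variable min-polynomial system for a branching MDP in which a controller minimizes the extinction probability. The two actions, their offspring distributions, and the resulting polynomials are invented for illustration; this is *not* the paper's polynomial-time algorithm, which is based on Newton's method with suitable rounding, since plain value iteration can converge far too slowly in general.

```python
# Hypothetical one-variable probabilistic min-polynomial Bellman equation:
#   action a: 0 children w.p. 0.3, 2 children w.p. 0.7 -> f_a(x) = 0.3 + 0.7*x**2
#   action b: 0 children w.p. 0.5, 1 child  w.p. 0.5  -> f_b(x) = 0.5 + 0.5*x
# The minimizing controller's equation is x = min(f_a(x), f_b(x)); its least
# fixed point in [0, 1] is the optimal (minimal) extinction probability,
# which here works out to 3/7.

def lfp_min(eps=1e-9, max_iter=10_000):
    """Monotone value iteration from 0; converges upward to the least fixed point."""
    x = 0.0
    for _ in range(max_iter):
        nxt = min(0.3 + 0.7 * x * x, 0.5 + 0.5 * x)
        if nxt - x < eps:  # increments shrink geometrically for this example
            return nxt
        x = nxt
    return x

print(lfp_min())  # approximately 3/7 = 0.428571...
```

Starting from 0 matters: the monotone map has a second fixed point at 1, and iterating from below is what singles out the least one. In this toy example the iteration contracts with rate roughly 0.6 near the fixed point, so it converges quickly; the paper's contribution is achieving time polynomial in log(1/ε) even for systems where such iteration needs exponentially many rounds.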