We present two new algorithms for finding optimal strategies for discounted, infinite-horizon, Deterministic Markov Decision Processes (DMDP). The first is an adaptation of an algorithm of Young, Tarjan and Orlin for finding minimum mean weight cycles. It runs in O(mn + n^2 log n) time, where n is the number of vertices (or states) and m is the number of edges (or actions). The second is an adaptation of a classical algorithm of Karp for finding minimum mean weight cycles. It runs in O(mn) time. The first algorithm has a slightly slower worst-case complexity, but is faster than the second algorithm in many situations. Both algorithms improve on a recent O(mn^2)-time algorithm of Andersson and Vorobyov. We also present a randomized Õ(m^{1/2} n^2)-time algorithm for finding Discounted All-Pairs Shortest Paths (DAPSP), improving several previous algorithms.
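For orientation, the following is a minimal Python sketch of the classical (undiscounted) minimum mean weight cycle algorithm of Karp, which the second algorithm adapts. It is not the discounted adaptation described in the abstract. The function name `min_mean_cycle` and the edge-list input format are assumptions made for illustration only. The sketch fills an (n + 1) x n table with one pass over the edges per row, which gives the O(mn) running time.

```python
import math


def min_mean_cycle(n, edges):
    """Minimum mean weight over all directed cycles, or None if the graph is acyclic.

    Hypothetical interface: vertices are 0..n-1, edges is a list of (u, v, weight).
    """
    inf = math.inf
    # d[k][v] = minimum weight of a walk with exactly k edges ending at v.
    # Every vertex is allowed as a starting point (d[0][v] = 0), which is
    # equivalent to adding a super-source with zero-weight edges to all vertices.
    d = [[inf] * n for _ in range(n + 1)]
    for v in range(n):
        d[0][v] = 0.0
    for k in range(1, n + 1):
        for u, v, w in edges:
            if d[k - 1][u] + w < d[k][v]:
                d[k][v] = d[k - 1][u] + w

    # Karp's characterization: the answer is
    #   min over v of  max over k < n of  (d[n][v] - d[k][v]) / (n - k).
    best = None
    for v in range(n):
        if d[n][v] == inf:
            continue  # no walk with n edges ends at v
        worst = max(
            (d[n][v] - d[k][v]) / (n - k)
            for k in range(n)
            if d[k][v] < inf
        )
        if best is None or worst < best:
            best = worst
    return best
```

For example, on the edges `(0, 1, 1)`, `(1, 2, 2)`, `(2, 0, 3)`, `(1, 0, 0)` the cycle 0 -> 1 -> 0 has total weight 1 over two edges, so the function returns a mean of 0.5.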