Markov Decision Processes (MDPs) provide a coherent mathematical framework for planning under uncertainty. However, exact MDP solution algorithms require the manipulation of a value function, which specifies a value for each state in the system. Most real-world MDPs are too large for such a representation to be feasible, preventing the use of exact MDP algorithms. Various approximate solution algorithms have been proposed, many of which use a linear combination of basis functions as a compact approximation to the value function. Almost all of these algorithms use an approximation based on the (weighted) L2-norm (Euclidean distance); this approach prevents the application of standard convergence results for MDP algorithms, all of which are based on the max-norm. This paper makes two contributions. First, it presents the first approximate MDP solution algorithms, for both value iteration and policy iteration, that use max-norm projection, thereby directly optimizing the quantity required to obtain the best error bounds. Second, it shows how these algorithms can be applied efficiently in the context of factored MDPs, where the transition model is specified using a dynamic Bayesian network.
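A minimal sketch of the core idea, under simplifying assumptions (this is a toy illustration, not the paper's algorithm): approximate value iteration in which each Bellman backup is followed by a max-norm projection of the backed-up values onto the span of the basis functions. With a single constant basis function phi(s) = 1, the projection argmin_w max_s |w - v(s)| has the closed form (min v + max v) / 2, so no linear program is needed. The toy two-state MDP below is hypothetical, and the final check is the standard max-norm guarantee that a value function with Bellman residual eps is within eps / (1 - gamma) of the optimum.

```python
GAMMA = 0.9
N_STATES = 2

# Hypothetical deterministic two-state MDP:
# action 0 = stay, action 1 = switch states; reward depends only on the state.
REWARD = [0.0, 1.0]

def next_state(s, a):
    return s if a == 0 else 1 - s

def bellman_backup(v):
    """One exact Bellman backup T[v], evaluated at every state."""
    return [max(REWARD[s] + GAMMA * v[next_state(s, a)] for a in (0, 1))
            for s in range(N_STATES)]

def maxnorm_project_constant(v):
    """Max-norm projection onto the span of the constant basis phi(s) = 1:
    argmin_w max_s |w - v(s)| = (min(v) + max(v)) / 2."""
    return (min(v) + max(v)) / 2.0

# Approximate value iteration: backup, then max-norm projection.
w = 0.0
for _ in range(300):
    w = maxnorm_project_constant(bellman_backup([w] * N_STATES))

# Exact value iteration, for comparison against the approximation.
v = [0.0] * N_STATES
for _ in range(300):
    v = bellman_backup(v)

# Bellman residual of the approximation, and its true max-norm error.
eps = max(abs(t - w) for t in bellman_backup([w] * N_STATES))
err = max(abs(v[s] - w) for s in range(N_STATES))
# Standard max-norm bound: err <= eps / (1 - GAMMA).
```

For this toy chain the approximate fixed point is w = 5 while the true values are V* = (9, 10), so the error of 5 exactly saturates the eps / (1 - gamma) bound with eps = 0.5; this is the kind of guarantee that L2-based projections do not directly provide, which is the motivation for projecting in the max-norm.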