Although many real-world stochastic planning problems are more naturally formulated by hybrid models with both discrete and continuous variables, current state-of-the-art methods cannot adequately address these problems. We present the first framework that can exploit problem structure for modeling and solving hybrid problems efficiently. We formulate these problems as hybrid Markov decision processes (MDPs with continuous and discrete state and action variables), which we assume can be represented in a factored way using a hybrid dynamic Bayesian network (hybrid DBN). This formulation also allows us to apply our methods to collaborative multiagent settings. We present a new linear program approximation method that exploits the structure of the hybrid MDP and lets us compute approximate value functions more efficiently. In particular, we describe a new factored discretization of continuous variables that avoids the exponential blow-up of traditional approaches. We provide theoretical bounds on the quality of such an approximation and on its scale-up potential. We support our theoretical arguments with experiments on a set of control problems with up to 28-dimensional continuous state space and 22-dimensional action space.
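The linear program approximation described above builds on approximate linear programming (ALP): the value function is restricted to a weighted combination of basis functions, so that each Bellman constraint becomes linear in the basis weights. The sketch below illustrates only that core LP, on a tiny, fully enumerated discrete MDP; the toy sizes, random dynamics, and indicator basis are illustrative assumptions, not the paper's benchmarks, and the paper's contribution lies precisely in exploiting factored structure and a factored discretization so that this kind of exhaustive enumeration of states, actions, and constraints is avoided.

```python
# Minimal ALP sketch on a toy, fully enumerated discrete MDP.
# All problem data below (sizes, rewards, dynamics, basis) are assumed
# for illustration only; they are not the paper's hybrid benchmarks.
import numpy as np
from scipy.optimize import linprog

gamma = 0.95                          # discount factor
n_states, n_actions = 4, 2            # toy problem size (assumed)

rng = np.random.default_rng(0)
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))             # R(s, a)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P(s'|s, a)

# Basis functions h_i(s): approximate V(s) ~= sum_i w_i * h_i(s).
# Here a constant feature plus one indicator per pair of states (assumed).
H = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0]])       # shape (n_states, n_basis)
n_basis = H.shape[1]

# ALP: minimize c^T (H w) subject to, for every state-action pair (s, a),
#   (H w)(s) >= R(s, a) + gamma * sum_{s'} P(s'|s, a) (H w)(s'),
# which is one linear constraint in the weights w per (s, a).
c = np.ones(n_states) / n_states      # uniform state-relevance weights
objective = H.T @ c                   # minimize (c^T H) w

A_ub, b_ub = [], []
for s in range(n_states):
    for a in range(n_actions):
        # Rearranged as  -(H[s] - gamma * P[s, a] @ H) w <= -R[s, a]
        A_ub.append(-(H[s] - gamma * P[s, a] @ H))
        b_ub.append(-R[s, a])

res = linprog(objective, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * n_basis, method="highs")
w = res.x
print("basis weights:", w)
print("approximate values:", H @ w)
```

In the factored approach, the objective coefficients and the backprojections of the basis functions decompose along the hybrid DBN structure, which is what avoids enumerating the exponential state-action space that this toy version walks over explicitly.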