This paper addresses the problem of solving discrete-time optimal sequential decision-making problems whose disturbance space $W$ is composed of a finite number of elements. In this context, the problem of finding, from an initial state $x_0$, an optimal decision strategy can be stated as an optimization problem that seeks an optimal combination of decisions attached to the nodes of a disturbance tree modeling all possible sequences of disturbances $(w_0, w_1, \ldots, w_{T-1}) \in W^T$ over the optimization horizon $T$. A significant drawback of this approach is that the resulting optimization problem has a search space which is the Cartesian product of $O(|W|^{T-1})$ decision spaces $U$, which makes the approach computationally impractical as soon as the optimization horizon grows, even if $W$ has only a handful of elements. To circumvent this difficulty, we propose to exploit an ensemble of randomly generated incomplete disturbance trees of controlled complexity, to solve their induced optimization problems in parallel, and to combine their predictions at time $t = 0$ to obtain a (near-)optimal first-stage decision. Because this approach postpones the determination of the decisions for subsequent stages until additional information about the realization of the uncertain process becomes available, we call it lazy. Simulations carried out on a robot corridor navigation problem show that even with small incomplete trees, this approach can lead to near-optimal decisions.
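The scheme described in the abstract can be sketched on a toy instance. The problem below (1-D state, dynamics $x_{t+1} = x_t + u_t + w_t$, stage cost $|x_{t+1}|$) and all parameter choices are illustrative assumptions, not taken from the paper; only the overall structure — random incomplete disturbance trees of bounded branching, solved independently, with a majority vote on the first-stage decision — follows the abstract. The trees are solved sequentially here, whereas the paper proposes solving them in parallel.

```python
import random
from collections import Counter

# Toy instance (illustrative assumptions, not from the paper).
W = [-1, 1]        # finite disturbance space
U = [-1, 0, 1]     # finite decision space
T = 4              # optimization horizon
cost = abs         # stage cost: distance of the next state from 0

def solve_node(x, t, branch_factor, rng):
    """Expected cost-to-go and best decision at a tree node holding state x
    at stage t. Only `branch_factor` disturbances (sampled without
    replacement) are expanded per node, giving an *incomplete* tree;
    branch_factor = len(W) recovers the complete disturbance tree."""
    if t == T:
        return 0.0, None
    ws = rng.sample(W, min(branch_factor, len(W)))
    best_cost, best_u = float("inf"), None
    for u in U:
        c = 0.0
        for w in ws:
            nx = x + u + w
            c += cost(nx) + solve_node(nx, t + 1, branch_factor, rng)[0]
        c /= len(ws)
        if c < best_cost:
            best_cost, best_u = c, u
    return best_cost, best_u

def lazy_first_stage(x0, n_trees=21, branch_factor=1, seed=0):
    """Solve an ensemble of small random incomplete trees and combine
    their first-stage decisions by majority vote."""
    rng = random.Random(seed)
    votes = Counter(solve_node(x0, 0, branch_factor, rng)[1]
                    for _ in range(n_trees))
    return votes.most_common(1)[0][0]

print(lazy_first_stage(x0=3))   # ensemble's first-stage decision u_0
```

With `branch_factor=1` each tree has only $|U|$ children per node instead of $|U| \cdot |W|$, so every member of the ensemble stays cheap to solve; the vote over many such trees plays the role of the combination step at $t = 0$, and decisions for later stages are simply not committed to, in keeping with the "lazy" idea.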