State abstraction is of central importance in reinforcement learning and Markov Decision Processes. This paper studies the case of variable resolution state abstraction for continuous-state, deterministic dynamic control problems in which near-optimal policies are required. We describe variable resolution policy and value function representations based on Kuhn triangulations embedded in a kd-tree. We then consider top-down approaches to choosing which cells to split in order to generate improved policies. We begin with local approaches based on value function properties and policy properties that use only features of individual cells in making splitting choices. Later, by introducing two new non-local measures, influence and variance, we derive a splitting criterion that allows one cell to efficiently take into account its impact on other cells when deciding whether to split. We evaluate the performance of a variety of splitting criteria on many benchmark problems (published on the web), paying careful attention to their number-of-cells versus closeness-to-optimality tradeoff curves.
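To make the top-down splitting idea concrete, the following is a minimal Python sketch of greedy variable resolution refinement over a kd-tree of hyper-rectangular cells. It is an illustration under assumptions, not the paper's implementation: the toy value function, the cell geometry, and the purely local splitting score (value spread over a cell's corners) are stand-ins, and the paper's Kuhn-triangulation representation and its non-local influence and variance criteria are not reproduced here.

```python
# Sketch only: greedy top-down refinement of a kd-tree of cells, splitting
# wherever a local score (here, the spread of a toy value function over a
# cell's corners) is largest. The score and value function are assumptions
# for illustration, not the paper's actual criteria.

from dataclasses import dataclass
from typing import Callable, Optional
import itertools


@dataclass
class Cell:
    low: tuple   # lower corner of the hyper-rectangle
    high: tuple  # upper corner of the hyper-rectangle
    children: Optional[tuple] = None  # (left, right) after a split


def local_score(cell: Cell, value_fn: Callable) -> float:
    """Spread of the value function over the cell's corners."""
    corners = itertools.product(*zip(cell.low, cell.high))
    vals = [value_fn(c) for c in corners]
    return max(vals) - min(vals)


def split(cell: Cell) -> tuple:
    """Split a cell in half along its longest dimension."""
    widths = [h - l for l, h in zip(cell.low, cell.high)]
    d = widths.index(max(widths))
    mid = (cell.low[d] + cell.high[d]) / 2.0
    left = Cell(cell.low, cell.high[:d] + (mid,) + cell.high[d + 1:])
    right = Cell(cell.low[:d] + (mid,) + cell.low[d + 1:], cell.high)
    cell.children = (left, right)
    return left, right


def refine(root: Cell, value_fn: Callable, threshold: float, max_cells: int) -> list:
    """Repeatedly split the leaf with the largest local score until all
    scores fall below `threshold` or the cell budget is exhausted."""
    leaves = [root]
    while len(leaves) < max_cells:
        scored = [(local_score(c, value_fn), c) for c in leaves]
        best_score, best = max(scored, key=lambda sc: sc[0])
        if best_score < threshold:
            break
        leaves.remove(best)
        leaves.extend(split(best))
    return leaves


if __name__ == "__main__":
    # Toy 2-D example: refinement concentrates near the kink of the value function.
    v = lambda x: abs(x[0] - 0.3) + 0.5 * abs(x[1] - 0.7)
    leaves = refine(Cell((0.0, 0.0), (1.0, 1.0)), v, threshold=0.05, max_cells=64)
    print(f"{len(leaves)} leaf cells after refinement")
```

A purely local score like the one above corresponds to the paper's first family of criteria; the influence and variance measures it introduces go further by weighting a cell's refinement according to its effect on the value estimates of other cells.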