In this chapter we describe the Trajectory Tree, or TTree, algorithm. TTree uses a small set of supplied policies to help solve a Semi-Markov Decision Process (SMDP). The algorithm uses a learned tree-based discretization of the state space as an abstract state description, and both user-supplied and auto-generated policies as temporally abstract actions. It uses a generative model of the world to sample the transition function of the abstract SMDP defined by those state and temporal abstractions, and then finds a policy for that abstract SMDP. This abstract policy can then be mapped back to a policy for the base SMDP, solving the supplied problem. In this chapter we present the TTree algorithm and give empirical comparisons with other SMDP algorithms that demonstrate its effectiveness.
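The pipeline described above can be illustrated with a small sketch. This is not the chapter's actual implementation: it assumes a toy one-dimensional chain SMDP, a fixed two-leaf tree discretization, and two hand-written supplied policies (`left`, `right`); the function names (`generative_model`, `sample_abstract_transition`, `solve_abstract`) are illustrative only. The sketch shows the three steps the abstract outlines: sample abstract transitions by running a supplied policy until the trajectory leaves its current leaf, estimate the abstract SMDP's model from those samples, and solve the abstract SMDP, mapping its policy back to base actions.

```python
import random

N = 10        # base states 0..9; reaching state 9 gives reward 1 and terminates
GAMMA = 0.9   # discount factor

def generative_model(s, a):
    """Sample the base SMDP: apply base action a in {-1, +1} to state s."""
    s2 = max(0, min(N - 1, s + a))
    return s2, (1.0 if s2 == N - 1 else 0.0)

# Supplied policies used as temporally abstract actions (state -> base action).
policies = {"left": lambda s: -1, "right": lambda s: +1}

def leaf(s):
    """A fixed stand-in for the learned tree discretization: two leaves."""
    return 0 if s < N // 2 else 1

def sample_abstract_transition(s, pi, max_steps=20):
    """Run policy pi from s until the trajectory exits the current leaf.

    Returns (next_leaf, discounted_reward, discount_so_far, terminal)."""
    start, total_r, disc = leaf(s), 0.0, 1.0
    for _ in range(max_steps):
        s, r = generative_model(s, pi(s))
        total_r += disc * r
        disc *= GAMMA
        if leaf(s) != start or s == N - 1:
            break
    return leaf(s), total_r, disc, s == N - 1

def estimate_model(n_samples=30):
    """Sample the abstract SMDP's transition/reward function per (leaf, policy)."""
    leaf_states = {0: range(0, N // 2), 1: range(N // 2, N - 1)}
    return {(l, name): [sample_abstract_transition(random.choice(list(states)), pi)
                        for _ in range(n_samples)]
            for l, states in leaf_states.items()
            for name, pi in policies.items()}

def solve_abstract(model, iters=100):
    """Value iteration on the sampled abstract SMDP; returns the abstract policy."""
    V = {0: 0.0, 1: 0.0}
    for _ in range(iters):
        Q = {(l, a): sum(r + (0.0 if term else d * V[l2])
                         for l2, r, d, term in samples) / len(samples)
             for (l, a), samples in model.items()}
        V = {l: max(Q[(l, a)] for a in policies) for l in V}
    return {l: max(policies, key=lambda a: Q[(l, a)]) for l in V}

def base_action(s, pi_abs):
    """Map the abstract policy back to a base-SMDP action."""
    return policies[pi_abs[leaf(s)]](s)

random.seed(0)
pi_abs = solve_abstract(estimate_model())
```

On this toy chain, reward lies at the right end, so the recovered abstract policy selects the `right` policy in both leaves, and `base_action` then yields `+1` in every base state.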