We present Variable Influence Structure Analysis, or VISA, an algorithm that performs hierarchical decomposition of factored Markov decision processes. VISA uses a dynamic Bayesian network model of actions and constructs a causal graph that captures the influence relationships between state variables. In tasks with sparse causal graphs, VISA exploits this structure by introducing activities, temporally extended actions whose purpose is to change the values of state variables. The result is a hierarchy of activities that together represent a solution to the original task. VISA performs state abstraction for each activity by ignoring irrelevant state variables and lower-level activities. In addition, we describe an algorithm for constructing compact models of the introduced activities. State abstraction and compact activity models enable VISA to apply efficient algorithms to solve the stand-alone subtask associated with each activity. Experimental results show that the decomposition introduced by VISA can significantly accelerate construction of an optimal, or near-optimal, policy.
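As a rough illustration of the causal-graph step described above, here is a minimal Python sketch under assumptions of our own: DBN action models are summarized as parent sets (for each action and state variable, the set of variables whose current values influence that variable's next value), and the resulting graph is acyclic. The names (parents, build_causal_graph, the gridworld-style variables) are hypothetical and not VISA's actual implementation; VISA itself merges strongly connected components before ordering.

```python
from graphlib import TopologicalSorter

# Hypothetical DBN action models summarized as parent sets:
# parents[action][var] = variables whose current values influence
# var's next value under that action (self-influence included).
parents = {
    "move":   {"location": {"location"},
               "key":      {"location", "key"}},
    "pickup": {"key":      {"location", "key"},
               "door":     {"key", "door"}},
}

def build_causal_graph(parents):
    """Causal graph as {variable: set of influencing variables}.
    There is an edge u -> v iff u appears in some action's parent
    set for v with u != v (self-loops are dropped)."""
    graph = {}
    for dbn in parents.values():
        for v, ps in dbn.items():
            graph.setdefault(v, set()).update(p for p in ps if p != v)
    return graph

graph = build_causal_graph(parents)
# graph == {"location": set(), "key": {"location"}, "door": {"key"}}

# TopologicalSorter maps each node to its predecessors, which is
# exactly how `graph` is oriented: u -> v means u influences v.
# The order suggests which variables' activities can be built first.
for var in TopologicalSorter(graph).static_order():
    print(var)   # location, key, door
```

In VISA's terms, each variable (or merged strongly connected component) in this ordering would then receive activities that change its value, and each activity's stand-alone subtask would be solved over only the variables that influence it in the graph, which is the state abstraction the abstract refers to.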