To operate effectively in complex environments, learning agents require the ability to form useful abstractions, that is, to ignore irrelevant details. Stated in general terms, this is a very difficult problem, and much of the work in this field is specialized to specific modeling frameworks. We introduce an abstraction framework for Markov decision processes (MDPs) based on homomorphisms relating MDPs. We build on the classical finite-state automata literature and develop a minimization framework for MDPs that can exploit structure and symmetries to derive smaller equivalent models of the problem. Since employing homomorphisms for minimization requires exact equivalence, we also introduce approximate and partial homomorphisms and develop bounds for the loss that results from employing these relaxed abstraction criteria. Our MDP minimization results can be readily employed by reinforcement learning (RL) methods for forming abstractions.

We extend our abstraction approach to hierarchical RL, specifically using the options framework. We introduce relativized options, a generalization of Markov sub-goal options, that allow us to define options without an absolute frame of reference. We introduce an extension to the options framework, based on relativized options, that allows us to learn simultaneously at multiple levels of the hierarchy. We provide theoretical guarantees regarding the performance of hierarchical systems that employ approximate abstractions, and we validate these results empirically in several test-beds.

Relativized options can also be interpreted as behavioral schemas. We demonstrate that such schemas can be profitably employed in a hierarchical RL setting. We also develop algorithms that learn the appropriate parameter binding for a given schema, and we empirically demonstrate the validity and utility of these algorithms. Relativized options allow us to model certain aspects of deictic or indexical representations, and we develop a modification of our parameter-binding algorithm suited to hierarchical RL architectures that employ deictic representations.
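To make the minimization idea concrete, the following is a minimal sketch (a hypothetical toy example, not code from this work) of an MDP homomorphism: a state map together with state-dependent action maps that collapse a reflection symmetry in a 5-state chain, yielding an equivalent quotient MDP with only 3 states. The chain, goal placement, and the maps `h` and `g` below are illustrative assumptions.

```python
# Hypothetical toy MDP: states 0..4 on a chain, reward for stepping onto
# the centre state 2. Reflecting s -> 4 - s while swapping left/right
# leaves transitions and rewards unchanged, so a homomorphism can fold
# the chain and the quotient MDP needs only 3 states.

STATES = [0, 1, 2, 3, 4]
ACTIONS = ["left", "right"]

def step(s, a):
    """Deterministic transition: move along the chain, stay put at the ends."""
    return max(s - 1, 0) if a == "left" else min(s + 1, 4)

def reward(s, a):
    """Reward 1 for stepping onto the goal state 2."""
    return 1.0 if step(s, a) == 2 else 0.0

def h(s):
    """State map: fold the chain about its centre (0,1,2,3,4 -> 0,1,2,1,0)."""
    return min(s, 4 - s)

def g(s, a):
    """State-dependent action map: swap left/right on the reflected half."""
    if s > 2:
        return "right" if a == "left" else "left"
    return a

# Homomorphism conditions for this deterministic MDP, checked against a
# canonical representative (the lift s <= 2) of each aggregate state:
#   h(T(s, a)) = h(T(rep(h(s)), g(s, a)))  and  R(s, a) = R(rep(h(s)), g(s, a))
for s in STATES:
    for a in ACTIONS:
        rep = h(s)  # representative of the block containing s (always <= 2)
        assert h(step(s, a)) == h(step(rep, g(s, a)))
        assert reward(s, a) == reward(rep, g(s, a))

print("homomorphism conditions hold; quotient has",
      len({h(s) for s in STATES}), "states")
```

Running the checks confirms that the folded model preserves both dynamics and rewards, so a policy learned on the 3-state quotient lifts back to the original 5-state problem, which is the sense in which minimization "derives smaller equivalent models".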