We study an approach to performing concurrent activities in Markov decision processes (MDPs) based on the coarticulation framework. We assume that the agent has multiple degrees of freedom (DOF) in its action space, which enable it to perform several activities simultaneously. We demonstrate that one natural way to generate concurrency in the system is to coarticulate among the set of learned activities available to the agent. Because of the multiple DOF, each learned activity is generally associated with a redundant set of admissible sub-optimal policies. Given a new task defined as a set of prioritized subgoals, this flexibility enables the agent to commit to several subgoals concurrently, according to their priority levels. We present efficient approximate algorithms for computing such policies and for generating concurrent plans, and we evaluate our approach in a simulated domain.
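The mechanism the abstract describes can be sketched concretely: each learned activity admits a redundant set of near-optimal actions per state, and coarticulation intersects these admissible sets in subgoal-priority order. The following is a minimal illustrative sketch, not the paper's algorithm; the Q-value tables, the `epsilon` admissibility threshold, and all names are assumptions introduced for illustration.

```python
# Illustrative sketch of coarticulation among prioritized subgoals.
# Assumptions: each subgoal has a tabular Q-function mapping
# (state, action) -> value; "admissible" means within epsilon of optimal.

def admissible_actions(q, state, actions, epsilon):
    """The redundant set of admissible sub-optimal actions for one
    learned activity: those within epsilon of the best value."""
    best = max(q[(state, a)] for a in actions)
    return {a for a in actions if q[(state, a)] >= best - epsilon}

def coarticulate(q_tables, state, actions, epsilon):
    """Narrow the action set by each subgoal's admissible set, from
    highest to lowest priority; stop narrowing if a lower-priority
    subgoal would leave no action at all."""
    candidates = set(actions)
    for q in q_tables:  # q_tables ordered by decreasing priority
        narrowed = candidates & admissible_actions(q, state, candidates, epsilon)
        if not narrowed:
            break  # cannot serve this subgoal without hurting higher ones
        candidates = narrowed
    # Break remaining ties in favor of the highest-priority subgoal.
    return max(candidates, key=lambda a: q_tables[0][(state, a)])

# Hypothetical example: two subgoals over four actions in one state.
actions = ["N", "S", "E", "W"]
q_primary = {("s", "N"): 1.0, ("s", "S"): 0.95, ("s", "E"): 0.2, ("s", "W"): 0.1}
q_secondary = {("s", "N"): 0.1, ("s", "S"): 0.9, ("s", "E"): 0.8, ("s", "W"): 0.2}

# "N" and "S" are both admissible for the primary subgoal (epsilon = 0.1);
# the secondary subgoal prefers "S" among them, so coarticulation picks "S".
choice = coarticulate([q_primary, q_secondary], "s", actions, 0.1)
```

Because the primary subgoal's redundancy leaves more than one admissible action, the agent can serve the secondary subgoal "for free", which is the flexibility the abstract attributes to multiple degrees of freedom.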