Acquisition of intermediate goals for an agent executing multiple tasks
IEEE Transactions on Robotics
A skilligent robot must be able to learn skills autonomously in order to accomplish a task; "skilligence" denotes the robot's capacity to control its behaviors reasonably, based on the skills acquired at run time. We use Bayesian networks as the basis for this behavior control. Subgoals are first discovered by clustering similar features of state transition tuples, each consisting of a current state, an action, and a next state; the features used for clustering are derived from the changes of state within these tuples. The parameters of the Bayesian networks and the utility functions are then learned separately from the state transition tuples belonging to each subgoal. To select the best action during task execution, the expected utility of each subgoal is calculated, and the robot chooses the action that maximizes expected utility; this maximum expected utility (MEU) computation is based on the conditional probability distributions of the Bayesian networks and the utility functions. We also propose a method for reconstructing the learned networks and adding subgoals through incremental learning. To validate the proposed methods, a task combining the Dribbling-Box-Into-a-Goal (DBIG) and Obstacle-Avoidance-While-Dribbling-Box (OAWDB) skills is evaluated in both simulation and physical experiments.
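The subgoal-discovery step can be pictured concretely. The abstract specifies only that transition tuples are clustered on features built from state changes, so the following is a minimal sketch under that assumption: it uses k-means (one possible clustering algorithm, not necessarily the paper's) over the state-change vectors (next_state - state), and the function name discover_subgoals and all toy data are illustrative.

    import numpy as np
    from sklearn.cluster import KMeans

    def discover_subgoals(transitions, n_subgoals):
        """transitions: list of (state, action, next_state) tuples with
        1-D numpy-array states. Returns a candidate subgoal label per tuple."""
        # Feature for each tuple: the change of state it produces.
        deltas = np.stack([s_next - s for s, _, s_next in transitions])
        return KMeans(n_clusters=n_subgoals, n_init=10).fit_predict(deltas)

    # Toy usage: 2-D states, two qualitatively different transition types
    # (e.g. motion toward the goal vs. motion around an obstacle).
    rng = np.random.default_rng(0)
    transitions = (
        [(rng.normal(size=2), 0, rng.normal(size=2) + np.array([3.0, 0.0]))
         for _ in range(20)] +
        [(rng.normal(size=2), 1, rng.normal(size=2) + np.array([0.0, 3.0]))
         for _ in range(20)]
    )
    print(discover_subgoals(transitions, n_subgoals=2))

Transitions whose state changes are similar land in the same cluster, and each cluster then serves as a candidate subgoal whose tuples are used to train that subgoal's own network and utility function.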
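The MEU action-selection rule itself is standard decision theory: EU(a | s) = sum over s' of P(s' | s, a) * U(s'), with the action of maximal expected utility chosen. The sketch below assumes a tabular stand-in for the Bayesian network's conditional distributions and for the learned utility function; the state/action names and probability values are hypothetical, not taken from the paper.

    def expected_utility(action, state, cpd, utility):
        """EU(a | s) = sum_{s'} P(s' | s, a) * U(s')."""
        return sum(p * utility[s_next]
                   for s_next, p in cpd[(state, action)].items())

    def meu_action(state, actions, cpd, utility):
        """Pick the action maximizing expected utility (the MEU rule)."""
        return max(actions,
                   key=lambda a: expected_utility(a, state, cpd, utility))

    # Toy example: from state "far", dribbling is more likely to reach "near".
    cpd = {
        ("far", "dribble"): {"near": 0.7, "far": 0.3},
        ("far", "avoid"):   {"near": 0.2, "far": 0.8},
    }
    utility = {"near": 1.0, "far": 0.0}
    print(meu_action("far", ["dribble", "avoid"], cpd, utility))  # -> "dribble"

In the paper's setting the conditional distributions come from the per-subgoal Bayesian networks rather than a fixed table, and incremental learning can reconstruct those networks or add subgoals as new transition tuples arrive.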