Acquisition of intermediate goals for an agent executing multiple tasks
IEEE Transactions on Robotics
A skilligent robot must be able to learn skills autonomously in order to accomplish a task; "skilligence" denotes the robot's capacity to control its behaviors reasonably, based on the skills acquired at run time. We use Bayesian networks as the basis for this behavior control. Subgoals are first discovered by clustering similar features of state transition tuples, each consisting of a current state, an action, and a next state; the features used for clustering are derived from the changes of state within these tuples. The parameters of the Bayesian networks and the utility functions are then learned separately from the state transition tuples belonging to each subgoal. To select the best action during task execution, the expected utility of each subgoal is calculated, and the robot chooses the action that maximizes expected utility; this maximum expected utility (MEU) computation is based on the conditional probability distributions of the Bayesian networks and the utility functions. We also propose a method for reconstructing the learned networks and adding subgoals through incremental learning. To validate the proposed methods, a task combining the Dribbling-Box-Into-a-Goal (DBIG) and Obstacle-Avoidance-While-Dribbling-Box (OAWDB) skills is evaluated in both simulation and physical experiments.
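The subgoal-discovery step can be pictured concretely. The abstract specifies only that transition tuples are clustered on features built from state changes, so the following is a minimal sketch under that assumption: it uses k-means (one possible clustering algorithm, not necessarily the paper's) over the state-change vectors (next_state - state), and the function name discover_subgoals and all toy data are illustrative.

    import numpy as np
    from sklearn.cluster import KMeans

    def discover_subgoals(transitions, n_subgoals):
        """transitions: list of (state, action, next_state) tuples with
        1-D numpy-array states. Returns a candidate subgoal label per tuple."""
        # Feature for each tuple: the change of state it produces.
        deltas = np.stack([s_next - s for s, _, s_next in transitions])
        return KMeans(n_clusters=n_subgoals, n_init=10).fit_predict(deltas)

    # Toy usage: 2-D states, two qualitatively different transition types
    # (e.g. motion toward the goal vs. motion around an obstacle).
    rng = np.random.default_rng(0)
    transitions = (
        [(rng.normal(size=2), 0, rng.normal(size=2) + np.array([3.0, 0.0]))
         for _ in range(20)] +
        [(rng.normal(size=2), 1, rng.normal(size=2) + np.array([0.0, 3.0]))
         for _ in range(20)]
    )
    print(discover_subgoals(transitions, n_subgoals=2))

Transitions whose state changes are similar land in the same cluster, and each cluster then serves as a candidate subgoal whose tuples are used to train that subgoal's own network and utility function.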
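The MEU action-selection rule itself is standard decision theory: EU(a | s) = sum over s' of P(s' | s, a) * U(s'), with the action of maximal expected utility chosen. The sketch below assumes a tabular stand-in for the Bayesian network's conditional distributions and for the learned utility function; the state/action names and probability values are hypothetical, not taken from the paper.

    def expected_utility(action, state, cpd, utility):
        """EU(a | s) = sum_{s'} P(s' | s, a) * U(s')."""
        return sum(p * utility[s_next]
                   for s_next, p in cpd[(state, action)].items())

    def meu_action(state, actions, cpd, utility):
        """Pick the action maximizing expected utility (the MEU rule)."""
        return max(actions,
                   key=lambda a: expected_utility(a, state, cpd, utility))

    # Toy example: from state "far", dribbling is more likely to reach "near".
    cpd = {
        ("far", "dribble"): {"near": 0.7, "far": 0.3},
        ("far", "avoid"):   {"near": 0.2, "far": 0.8},
    }
    utility = {"near": 1.0, "far": 0.0}
    print(meu_action("far", ["dribble", "avoid"], cpd, utility))  # -> "dribble"

In the paper's setting the conditional distributions come from the per-subgoal Bayesian networks rather than a fixed table, and incremental learning can reconstruct those networks or add subgoals as new transition tuples arrive.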