A causal approach to hierarchical decomposition in reinforcement learning

  • Authors:
  • Andrew G. Barto; Anders Jonsson

  • Affiliations:
  • University of Massachusetts Amherst; University of Massachusetts Amherst

  • Venue:
  • Doctoral dissertation, University of Massachusetts Amherst

  • Year:
  • 2006

Abstract

Reinforcement learning provides a means for autonomous agents to improve their action-selection strategies without explicit training information from an informed instructor. Theoretical and empirical results indicate that reinforcement learning algorithms can efficiently determine optimal or approximately optimal policies in tasks of limited size. However, as a task grows, these algorithms become less consistent and less efficient at finding a useful policy. A key challenge in reinforcement learning is therefore to develop methods that scale to larger, more realistic tasks.

We present a series of algorithms that exploit task structure to make reinforcement learning more efficient in realistic tasks that exhibit such structure. In each algorithm, we assume that the state space of a task is factored: a state is an assignment of values to a set of state variables. Our work combines hierarchical decomposition and state abstraction to reduce the size of a task before applying reinforcement learning. Hierarchical decomposition breaks a task into several subtasks that can be solved separately; for this to simplify learning, each subtask must be easier to solve than the overall task. To simplify the subtasks, we perform state abstraction separately for each one. We begin by presenting an algorithm that uses experience from the environment to dynamically perform state abstraction for each subtask in an existing hierarchy of subtasks. Because our goal is to automate hierarchical decomposition as well as state abstraction, a second algorithm uses a dynamic Bayesian network representation of actions to automatically decompose a task into a hierarchy of subtasks, and also provides an efficient way to perform state abstraction for each resulting subtask. A third algorithm constructs compact representations of the activities that solve the subtasks; these compact representations enable planning to efficiently approximate solutions to higher-level subtasks without interacting with the environment. Our fourth and final algorithm learns a dynamic Bayesian network representation of actions from experience, for tasks in which the representation is not available prior to learning.

The dissertation provides a detailed description of each algorithm together with some theoretical results, and we present empirical results for each algorithm in a series of experiments. In tasks that exhibit certain types of structure, the simplifications introduced by our algorithms significantly improve the performance of reinforcement learning. The results indicate that our algorithms are a promising approach to making reinforcement learning better suited to realistic tasks in which these types of structure are present.
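
As a rough illustration of the abstract's core idea (factored states combined with subtask-specific state abstraction), the following minimal Python sketch shows how a subtask that depends on only some of the state variables can be learned over a much smaller abstract state space. The toy task, the variable names, and the Subtask class are hypothetical and chosen only for this example; they are not code or notation from the dissertation.

    from dataclasses import dataclass

    # A factored state is an assignment of values to a set of state variables,
    # represented here simply as a dict. The task, variable names, and domains
    # below are hypothetical and chosen only for illustration.
    VARIABLE_DOMAINS = {
        "position": tuple(range(25)),   # 25 grid cells
        "has_key": (False, True),
        "door_open": (False, True),
    }

    @dataclass(frozen=True)
    class Subtask:
        """A subtask paired with the state variables it actually depends on."""
        name: str
        relevant_vars: frozenset

        def abstract(self, state):
            """Project a full state onto the subtask's relevant variables.

            All full states that agree on the relevant variables collapse to a
            single abstract state, so the subtask is learned over a smaller space.
            """
            return tuple(sorted((v, state[v]) for v in self.relevant_vars))

        def abstract_space_size(self):
            size = 1
            for v in self.relevant_vars:
                size *= len(VARIABLE_DOMAINS[v])
            return size

    # Two hypothetical subtasks, each ignoring the variables it does not need.
    fetch_key = Subtask("fetch-key", frozenset({"position", "has_key"}))
    open_door = Subtask("open-door", frozenset({"has_key", "door_open"}))

    full_size = 1
    for domain in VARIABLE_DOMAINS.values():
        full_size *= len(domain)

    print("full state space size:", full_size)                          # 100
    print("fetch-key abstract size:", fetch_key.abstract_space_size())  # 50
    print("open-door abstract size:", open_door.abstract_space_size())  # 4

    state = {"position": 7, "has_key": False, "door_open": False}
    print("abstract state for fetch-key:", fetch_key.abstract(state))

In the dissertation itself, the relevant variables of each subtask are identified automatically, either from experience or from a dynamic Bayesian network model of the actions, rather than hard-coded as in this sketch.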