First Results with Utile Distinction Memory for Reinforcement Learning

  • Authors:
  • R. A. McCallum

  • Affiliations:
  • -

  • Venue:
  • -
  • Year:
  • 1992

Abstract

This report presents a method by which a reinforcement learning agent can solve the incomplete perception problem using memory. The agent uses a Hidden Markov Model (HMM) to represent its internal state space and creates memory capacity by splitting states of the HMM. The key idea is a test that determines when and how a state should be split: the agent splits a state only when the split will help it predict utility. The agent can thus build an internal state space proportional to the needs of the task at hand, rather than one large enough to represent all of its perceivable world. I call the technique UDM, for Utile Distinction Memory.
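The splitting criterion described in the abstract can be illustrated with a small sketch. The code below is a hypothetical simplification, not McCallum's actual UDM algorithm: it groups the returns observed in a state by predecessor state, splits only when the conditioned utility estimates diverge past a threshold (a stand-in for UDM's statistical utility test), and otherwise leaves the state alone. The function names, the threshold rule, and the per-predecessor grouping are all illustrative assumptions.

```python
import statistics

def should_split(returns_by_predecessor, threshold=0.5):
    """Return True if the utility (return) estimates, conditioned on the
    predecessor state, diverge enough that splitting would help the agent
    predict utility. Hypothetical threshold rule, not UDM's exact test."""
    means = [statistics.mean(rs)
             for rs in returns_by_predecessor.values() if rs]
    if len(means) < 2:
        return False  # nothing to distinguish
    return max(means) - min(means) > threshold

def split_state(state_id, returns_by_predecessor):
    """Split one internal state into one child per predecessor; each child
    inherits the returns observed when arriving from that predecessor."""
    return {f"{state_id}/{pred}": rs
            for pred, rs in returns_by_predecessor.items()}

# Arriving from "A" predicts high return, from "B" low return, so the
# distinction is utile and the state is split; near-identical returns
# would leave the state unsplit, keeping the state space small.
data = {"A": [1.0, 0.9, 1.1], "B": [0.0, 0.1, -0.1]}
if should_split(data):
    children = split_state("s3", data)
```

The point of the test is the one the abstract emphasizes: distinctions are added only when they improve utility prediction, so memory grows with the task's needs rather than with the size of the perceivable world.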