Reducing the memory footprint of temporal difference learning over finitely many states by using case-based generalization

  • Authors:
  • Matt Dilts; Héctor Muñoz-Avila

  • Affiliations:
  • Department of Computer Science and Engineering, Lehigh University, Bethlehem, PA (both authors)

  • Venue:
  • ICCBR'10: Proceedings of the 18th International Conference on Case-Based Reasoning Research and Development
  • Year:
  • 2010

Abstract

In this paper we present an approach for reducing the memory footprint of temporal difference methods in which the set of states is finite. We use case-based generalization to group the states visited during the reinforcement learning (RL) process. We follow a lazy learning approach: cases are grouped in the order in which the states are visited. Any newly visited state is assigned to an existing entry in the Q-table provided that a similar state has been visited before; otherwise, a new entry is added to the Q-table. We performed experiments on a turn-based game in which actions have non-deterministic effects and might have long-term repercussions on the outcome of the game. The main conclusion from our experiments is that, by using case-based generalization, the size of the Q-table can be substantially reduced while maintaining the quality of the RL estimates.
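
The grouping scheme described in the abstract can be sketched in code. The following is a minimal illustration, not the authors' implementation: the class name CaseBasedQTable, the similarity function, the threshold tau, and the use of a one-step Q-learning update (shown here as a representative temporal difference method) are all assumptions introduced for this sketch.

    from typing import Callable, Hashable, List

    class CaseBasedQTable:
        """Q-table whose rows are representative states ("cases").
        New states are mapped to an existing row when a sufficiently
        similar state has been seen before; otherwise a new row is
        added lazily, in visitation order."""

        def __init__(self, actions: List[Hashable],
                     similarity: Callable[[Hashable, Hashable], float],
                     tau: float):
            self.actions = actions        # finite action set
            self.similarity = similarity  # sim(s1, s2) in [0, 1] (assumed)
            self.tau = tau                # similarity threshold (assumed)
            self.cases = []               # one representative state per row
            self.q = {}                   # (row_index, action) -> Q-value

        def resolve(self, state) -> int:
            """Return the row for `state`, creating a new row only if no
            existing case is similar enough."""
            for i, case in enumerate(self.cases):
                if self.similarity(state, case) >= self.tau:
                    return i
            self.cases.append(state)
            i = len(self.cases) - 1
            for a in self.actions:
                self.q[(i, a)] = 0.0      # initialize new row
            return i

        def update(self, s, a, r, s_next, alpha=0.1, gamma=0.9):
            """One-step Q-learning update applied to the generalized rows."""
            i, j = self.resolve(s), self.resolve(s_next)
            best_next = max(self.q[(j, b)] for b in self.actions)
            self.q[(i, a)] += alpha * (r + gamma * best_next - self.q[(i, a)])

        def best_action(self, s):
            """Greedy action for the row that `s` resolves to."""
            i = self.resolve(s)
            return max(self.actions, key=lambda a: self.q[(i, a)])

The memory saving comes from `resolve`: every state within distance tau of an earlier case shares that case's row, so the table grows with the number of distinct cases rather than the number of distinct states visited.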