Enhanced temporal difference learning using compiled eligibility traces

  • Authors:
  • Peter Vamplew; Robert Ollington; Mark Hepburn

  • Affiliations:
  • School of Information Technology and Mathematical Sciences, University of Ballarat, Ballarat, Victoria, Australia; School of Computing, University of Tasmania, Hobart, Tasmania, Australia; School of Computing, University of Tasmania, Hobart, Tasmania, Australia

  • Venue:
  • AI'06: Proceedings of the 19th Australian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
  • Year:
  • 2006

Abstract

Eligibility traces have been shown to substantially improve the convergence speed of temporal difference learning algorithms by maintaining a record of recently experienced states. This paper presents compiled traces, an extension of conventional eligibility traces that retains additional information about the agent's experience within the environment. Empirical results show that compiled traces outperform conventional traces when applied to policy evaluation tasks using a tabular representation of the state values.
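
For readers unfamiliar with the baseline, the following is a minimal sketch of conventional tabular TD(λ) policy evaluation with accumulating eligibility traces. It does not implement the compiled traces proposed in the paper, whose details the abstract does not give. The function name `td_lambda_evaluate`, the episode format, and the default parameter values are assumptions made for illustration.

```python
def td_lambda_evaluate(episodes, n_states, alpha=0.1, gamma=1.0, lam=0.8):
    """Tabular TD(lambda) policy evaluation with accumulating traces.

    episodes: iterable of episodes; each episode is a list of
        (state, reward, next_state) transitions, where next_state is
        None on the terminal transition (hypothetical format).
    Returns the list of estimated state values.
    """
    values = [0.0] * n_states
    for episode in episodes:
        # Traces are reset at the start of every episode.
        traces = [0.0] * n_states
        for state, reward, next_state in episode:
            next_value = 0.0 if next_state is None else values[next_state]
            td_error = reward + gamma * next_value - values[state]
            # Mark the current state as eligible for this update.
            traces[state] += 1.0
            for s in range(n_states):
                # Every recently visited state shares in the TD error,
                # weighted by its trace, which then decays.
                values[s] += alpha * td_error * traces[s]
                traces[s] *= gamma * lam
    return values


# Two-step episode: state 0 -> state 1 -> terminal, reward 1 at the end.
episode = [(0, 0.0, 1), (1, 1.0, None)]
with_traces = td_lambda_evaluate([episode], n_states=2, alpha=0.5, lam=1.0)
without_traces = td_lambda_evaluate([episode], n_states=2, alpha=0.5, lam=0.0)
```

In this toy episode, setting `lam=1.0` lets the final reward update the earlier state 0 within the same episode, whereas `lam=0.0` updates only state 1. This is the "record of recently experienced states" that the abstract credits for faster convergence.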