SarsaLandmark: an algorithm for learning in POMDPs with landmarks

  • Authors: Michael R. James; Satinder Singh
  • Affiliations: Toyota Research Institute NA, Ann Arbor, MI; University of Michigan, Ann Arbor, MI
  • Venue: Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
  • Year: 2009

Abstract

Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in partially observable Markov decision processes (POMDPs). Nevertheless, one can construct counterexamples, problems in which Sarsa(λ