Fast Online Q(λ)

  • Authors:
  • Marco Wiering;Jürgen Schmidhuber

  • Affiliations:
  • IDSIA, Corso Elvezia 36, 6900 Lugano, Switzerland. E-mail: marco@idsia.ch; IDSIA, Corso Elvezia 36, 6900 Lugano, Switzerland. E-mail: juergen@idsia.ch

  • Venue:
  • Machine Learning
  • Year:
  • 1998

Abstract

Q(λ)-learning uses TD(λ)-methods to accelerate Q-learning. The update complexity of previous online Q(λ) implementations based on lookup tables is bounded by the size of the state/action space. Our faster algorithm's update complexity is bounded by the number of actions. The method is based on the observation that Q-value updates may be postponed until they are needed.
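
The idea of postponing updates can be illustrated with a minimal sketch. It is not the paper's pseudocode. It assumes a tabular Q(λ) update in the style of Peng and Williams with accumulating eligibility traces, and every name in it (naive_q_lambda, lazy_q_lambda, sync, phi, delta_sum, scaled_trace, seen_delta) is hypothetical. The naive version touches every traced state/action pair at each step. The lazy version keeps one global running sum of discounted errors and brings a Q-value up to date only when it is about to be read, which happens for the actions of the current and the next state.

```python
from collections import defaultdict


def naive_q_lambda(transitions, n_actions, alpha=0.5, gamma=0.9, lam=0.8):
    """Reference version: every traced pair is updated at every step."""
    Q = defaultdict(float)
    trace = defaultdict(float)
    for s, a, r, s2 in transitions:
        v_s = max(Q[(s, b)] for b in range(n_actions))
        v_s2 = max(Q[(s2, b)] for b in range(n_actions))
        e_prime = r + gamma * v_s2 - v_s      # error propagated along traces
        e = r + gamma * v_s2 - Q[(s, a)]      # error for the current pair
        for key in list(trace):               # cost grows with visited pairs
            trace[key] *= gamma * lam
            Q[key] += alpha * e_prime * trace[key]
        Q[(s, a)] += alpha * e
        trace[(s, a)] += 1.0
    return dict(Q)


def lazy_q_lambda(transitions, n_actions, alpha=0.5, gamma=0.9, lam=0.8):
    """Postponed version: per-step work is proportional to n_actions."""
    Q = defaultdict(float)
    scaled_trace = defaultdict(float)  # trace divided by phi at visit time
    seen_delta = defaultdict(float)    # delta_sum when the pair was last synced
    phi = 1.0                          # (gamma * lam) ** t
    delta_sum = 0.0                    # running sum of e_prime * phi

    def sync(key):
        # Apply all updates this pair has missed since its last sync.
        Q[key] += alpha * scaled_trace[key] * (delta_sum - seen_delta[key])
        seen_delta[key] = delta_sum

    for s, a, r, s2 in transitions:
        # Only the Q-values that are about to be read must be current.
        for b in range(n_actions):
            sync((s, b))
            sync((s2, b))
        v_s = max(Q[(s, b)] for b in range(n_actions))
        v_s2 = max(Q[(s2, b)] for b in range(n_actions))
        e_prime = r + gamma * v_s2 - v_s
        e = r + gamma * v_s2 - Q[(s, a)]
        phi *= gamma * lam
        delta_sum += e_prime * phi
        sync((s, a))
        Q[(s, a)] += alpha * e
        scaled_trace[(s, a)] += 1.0 / phi
    # Flush the remaining postponed updates so both tables can be compared.
    for key in list(scaled_trace):
        sync(key)
    return dict(Q)


# Hypothetical toy trajectory of (state, action, reward, next_state) tuples.
transitions = [(0, 0, 0.0, 1), (1, 1, 0.0, 2), (2, 0, 1.0, 0),
               (0, 1, 0.0, 1), (1, 0, 0.5, 2), (2, 1, 1.0, 0)]
naive = naive_q_lambda(transitions, n_actions=2)
lazy = lazy_q_lambda(transitions, n_actions=2)
```

On this short trajectory the two tables should agree up to floating-point rounding. In this sketch phi shrinks geometrically, so 1.0 / phi would eventually overflow on long runs; a practical implementation needs some form of rescaling, which is left out here.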