Fast Online Q(λ)

Authors:
Marco Wiering;Jürgen Schmidhuber
Affiliations:
IDSIA, Corso Elvezia 36, 6900 Lugano, Switzerland. E-mail: marco@idsia.ch;IDSIA, Corso Elvezia 36, 6900 Lugano, Switzerland. E-mail: juergen@idsia.ch
Venue:
Machine Learning
Year:
1998

Abstract

Q(λ)-learning uses TD(λ)-methods to accelerate Q-learning. The update complexity of previous online Q(λ) implementations based on lookup tables is bounded by the size of the state/action space. Our faster algorithm's update complexity is bounded by the number of actions. The method is based on the observation that Q-value updates may be postponed until they are needed.
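
The idea of postponing updates can be illustrated with a small sketch. It is not the authors' exact algorithm: it assumes a simplified Q(λ) variant with accumulating traces, a constant step size, and no trace resetting, and every name in it (`eager_q_lambda`, `LazyQLambda`, `sync`, `flush`, `demo`, the constants) is hypothetical. The eager version touches every state/action pair at each step. The lazy version keeps one global running sum of discounted TD errors and brings an entry up to date only when its value is about to be read, so each step touches only the entries of the current pair and of the successor state.

```python
import random

GAMMA, LAMBDA, ALPHA = 0.9, 0.8, 0.1  # illustrative constants, not from the paper


def eager_q_lambda(transitions, n_states, n_actions):
    """Reference version: every step updates all state/action pairs."""
    q = [[0.0] * n_actions for _ in range(n_states)]
    trace = [[0.0] * n_actions for _ in range(n_states)]
    for s, a, r, s2 in transitions:
        td = r + GAMMA * max(q[s2]) - q[s][a]
        for i in range(n_states):
            for j in range(n_actions):
                trace[i][j] *= GAMMA * LAMBDA  # decay every trace
        trace[s][a] += 1.0  # accumulating trace for the visited pair
        for i in range(n_states):
            for j in range(n_actions):
                q[i][j] += ALPHA * td * trace[i][j]
    return q


class LazyQLambda:
    """Postponed updates: each step touches only n_actions + 1 entries."""

    def __init__(self, n_states, n_actions):
        self.n_actions = n_actions
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        # Trace divided by phi, so it stays constant while the true trace decays.
        self.scaled_trace = [[0.0] * n_actions for _ in range(n_states)]
        # Value of self.total when each entry was last brought up to date.
        self.seen = [[0.0] * n_actions for _ in range(n_states)]
        self.phi = 1.0    # (GAMMA * LAMBDA) ** t
        self.total = 0.0  # running sum of td_t * phi_t

    def sync(self, s, a):
        """Apply all updates postponed for (s, a) since its last sync."""
        pending = self.total - self.seen[s][a]
        self.q[s][a] += ALPHA * pending * self.scaled_trace[s][a]
        self.seen[s][a] = self.total

    def step(self, s, a, r, s2):
        for b in range(self.n_actions):  # values about to be read
            self.sync(s2, b)
        self.sync(s, a)
        td = r + GAMMA * max(self.q[s2]) - self.q[s][a]
        self.phi *= GAMMA * LAMBDA
        # 1 / phi grows without bound; a practical version must rescale it.
        self.scaled_trace[s][a] += 1.0 / self.phi
        self.total += td * self.phi

    def flush(self):
        """Bring every entry up to date (needed only for inspection)."""
        for s in range(len(self.q)):
            for a in range(self.n_actions):
                self.sync(s, a)
        return self.q


def demo(steps=30, n_states=5, n_actions=3, seed=0):
    """Return the largest gap between eager and lazy Q-values on random data."""
    rng = random.Random(seed)
    transitions, s = [], 0
    for _ in range(steps):
        s2 = rng.randrange(n_states)
        transitions.append((s, rng.randrange(n_actions), rng.random(), s2))
        s = s2
    eager = eager_q_lambda(transitions, n_states, n_actions)
    lazy = LazyQLambda(n_states, n_actions)
    for t in transitions:
        lazy.step(*t)
    lazy_q = lazy.flush()
    return max(abs(eager[i][j] - lazy_q[i][j])
               for i in range(n_states) for j in range(n_actions))
```

Calling `demo()` compares the two versions on a short random trajectory; under these assumptions they should agree up to floating-point rounding. The sketch keeps the trajectory short because `1 / phi` grows quickly, which a longer-running implementation would have to guard against.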