Experimental analysis on Sarsa(λ) and Q(λ) with different eligibility traces strategies

  • Authors: Jinsong Leng, Colin Fyfe, Lakhmi C. Jain

  • Affiliations:
  • Jinsong Leng (corresponding author, E-mail: Jinsong.Leng@unisa.edu.au): School of Electrical and Information Engineering, Knowledge Based Intelligent Engineering Systems Centre, University of South Australia, Mawson Lakes SA 5095, Australia
  • Colin Fyfe: Applied Computational Intelligence Research Unit, University of the West of Scotland, Paisley, Scotland
  • Lakhmi C. Jain: School of Electrical and Information Engineering, Knowledge Based Intelligent Engineering Systems Centre, University of South Australia, Mawson Lakes SA 5095, Australia

  • Venue: Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology - Theoretical advances of intelligent paradigms
  • Year: 2009

Abstract

Temporal difference learning and eligibility traces are two key mechanisms for solving reinforcement learning problems. The temporal difference technique bootstraps the state value or state-action value at every step, as in dynamic programming, and learns by sampling episodes from experience, as in the Monte Carlo approach. An eligibility trace is a mechanism for recording the degree to which each state is eligible to undergo the learning update. This paper investigates the underlying mechanism of eligibility trace strategies using on-policy and off-policy learning algorithms, namely Sarsa(λ) and Q(λ). To do so, performance metrics are obtained by defining the learning problem in a simulation environment and running it in conjunction with the different learning algorithms. Measuring learning performance and analysing parameter sensitivity are expensive, however, because such metrics can only be obtained by repeating the experiment over a range of parameter values. This paper therefore presents a comparative study of the eligibility trace mechanism, with the objective of comparing and investigating the influence of the different trace strategies on learning performance.
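
As a concrete illustration of the trace mechanism described above, the sketch below implements tabular on-policy Sarsa(λ) with a switch between the two classic trace strategies, accumulating and replacing. It is a minimal sketch for illustration only, not the authors' implementation: the ChainWalk toy environment, its reset/step interface, and all parameter values are assumptions introduced for this example.

```python
import numpy as np

class ChainWalk:
    """Hypothetical toy task: a 1-D chain, start in the middle, reward 1 at the right end."""
    def __init__(self, n=11):
        self.n = n
    def reset(self):
        self.s = self.n // 2
        return self.s
    def step(self, a):
        self.s += 1 if a == 1 else -1          # action 1 = right, 0 = left
        done = self.s <= 0 or self.s >= self.n - 1
        reward = 1.0 if self.s >= self.n - 1 else 0.0
        return self.s, reward, done

def sarsa_lambda(env, n_states, n_actions, episodes=500,
                 alpha=0.1, gamma=0.99, lam=0.9, epsilon=0.1,
                 trace="accumulating"):
    """Tabular Sarsa(lambda) with a selectable eligibility-trace strategy."""
    Q = np.zeros((n_states, n_actions))

    def epsilon_greedy(s):
        if np.random.rand() < epsilon:
            return np.random.randint(n_actions)
        return int(np.argmax(Q[s]))

    for _ in range(episodes):
        e = np.zeros_like(Q)                    # traces are reset at each episode
        s = env.reset()
        a = epsilon_greedy(s)
        done = False
        while not done:
            s2, r, done = env.step(a)
            a2 = epsilon_greedy(s2)
            # TD error: bootstrap from the next state-action value (zero at terminal states)
            delta = r + (0.0 if done else gamma * Q[s2, a2]) - Q[s, a]
            # Mark the visited pair as eligible for learning
            if trace == "accumulating":
                e[s, a] += 1.0                  # traces build up on revisits
            else:
                e[s, a] = 1.0                   # replacing traces cap at 1
            # Credit every eligible pair in proportion to its (decayed) trace
            Q += alpha * delta * e
            e *= gamma * lam
            s, a = s2, a2
    return Q

Q = sarsa_lambda(ChainWalk(), n_states=11, n_actions=2, trace="replacing")
```

The off-policy counterpart, Watkins's Q(λ), differs in two places: the TD error bootstraps from the maximum next state-action value rather than from the action actually taken, and the traces are cut to zero whenever an exploratory (non-greedy) action is chosen, since the bootstrapped return then no longer follows the greedy target policy.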