Experimental analysis of eligibility traces strategies in temporal difference learning
International Journal of Knowledge Engineering and Soft Data Paradigms
Temporal difference learning and eligibility traces are two core mechanisms for solving reinforcement learning problems. Temporal difference methods bootstrap the state value or state-action value at every step, as in dynamic programming, while learning from sampled episodes of experience, as in the Monte Carlo approach. Eligibility traces provide a means of recording the degree to which each state is eligible to undergo the learning process. This paper investigates the underlying mechanism of eligibility trace strategies using on-policy and off-policy learning algorithms. To this end, performance metrics are obtained by defining the learning problem in a simulation environment and running it with different learning algorithms. Measuring learning performance and analysing sensitivity is expensive, however, because such metrics can only be obtained by repeating the experiment with different parameter values. This paper therefore presents a comparative study of the mechanism of eligibility traces, with the objective of comparing and investigating the influence of these different strategies on performance.
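The on-policy case described above can be sketched as tabular Sarsa(λ) with accumulating eligibility traces. The random-walk environment, parameter values, and function names below are illustrative assumptions for exposition, not the experimental setup used in the paper:

```python
import random

def sarsa_lambda(episodes=300, alpha=0.1, gamma=0.95, lam=0.8,
                 epsilon=0.1, seed=0):
    """Tabular Sarsa(lambda) on a toy 7-state random-walk chain.

    States 0..6; 0 and 6 are terminal. Action 0 moves left, action 1
    moves right. Reaching state 6 yields reward 1, state 0 yields 0.
    (Hypothetical environment chosen only to illustrate the mechanism.)
    """
    rng = random.Random(seed)
    n_states, n_actions = 7, 2
    Q = [[0.0] * n_actions for _ in range(n_states)]

    def greedy(s):
        best = max(Q[s])
        return rng.choice([a for a in range(n_actions) if Q[s][a] == best])

    def policy(s):
        # epsilon-greedy over the current value estimates
        return rng.randrange(n_actions) if rng.random() < epsilon else greedy(s)

    for _ in range(episodes):
        # eligibility traces are reset at the start of each episode
        e = [[0.0] * n_actions for _ in range(n_states)]
        s = 3
        a = policy(s)
        for _ in range(1000):  # step cap as a simple safety guard
            s2 = s + 1 if a == 1 else s - 1
            done = s2 in (0, 6)
            r = 1.0 if s2 == 6 else 0.0
            a2 = policy(s2) if not done else 0
            # TD error: bootstrap on Q(s', a'), the on-policy target
            delta = r + (0.0 if done else gamma * Q[s2][a2]) - Q[s][a]
            e[s][a] += 1.0  # accumulating trace: (s, a) is now eligible
            for si in range(1, n_states - 1):
                for ai in range(n_actions):
                    Q[si][ai] += alpha * delta * e[si][ai]
                    e[si][ai] *= gamma * lam  # traces decay geometrically
            if done:
                break
            s, a = s2, a2
    return Q
```

The trace decay factor γλ is where the strategies compared in the paper differ: λ = 0 recovers one-step Sarsa, λ = 1 approaches a Monte Carlo update, and intermediate values spread each TD error over recently visited state-action pairs in proportion to their eligibility. An off-policy variant such as Watkins's Q(λ) would additionally cut the traces whenever an exploratory (non-greedy) action is taken.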