In this paper, the robustness of SARSA(λ), the reinforcement learning algorithm with eligibility traces, is confronted with different models of reward and different initialisations of the Q-table. Most empirical analyses of eligibility traces in the literature have focused mainly on the step-penalty reward. We analyse two general types of reward (final-goal and step-penalty rewards) and show that learning with long traces, i.e., with high values of λ, can lead to suboptimal solutions in some situations. The problems that arise are identified and discussed. Specifically, the results show that SARSA(λ) is sensitive to the model of reward and to the initialisation of the Q-table; in some cases its asymptotic performance can be significantly reduced.
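To make the ingredients of the study concrete, the following is a minimal sketch of tabular SARSA(λ) with replacing eligibility traces, exposing the two factors the abstract varies: the Q-table initialisation (`q_init`) and the reward model (`step_penalty` vs. `goal_reward`). The toy chain environment and all parameter values here are illustrative assumptions, not taken from the paper.

```python
import random

def sarsa_lambda(n_states=10, n_actions=2, episodes=200,
                 alpha=0.1, gamma=0.95, lam=0.9, epsilon=0.1,
                 q_init=0.0, step_penalty=-1.0, goal_reward=0.0,
                 seed=0):
    """Tabular SARSA(lambda) with replacing traces on a toy chain MDP."""
    rng = random.Random(seed)
    goal = n_states - 1
    # Q-table initialisation -- one of the factors the paper varies.
    Q = [[q_init] * n_actions for _ in range(n_states)]

    def step(s, a):
        # Chain MDP (illustrative assumption): action 1 moves right,
        # action 0 moves left; the rightmost state is terminal.
        s2 = min(s + 1, goal) if a == 1 else max(s - 1, 0)
        done = (s2 == goal)
        # Reward model: step-penalty on every move, final reward at the goal.
        r = goal_reward if done else step_penalty
        return s2, r, done

    def policy(s):
        # epsilon-greedy action selection.
        if rng.random() < epsilon:
            return rng.randrange(n_actions)
        return max(range(n_actions), key=lambda a: Q[s][a])

    for _ in range(episodes):
        # Eligibility traces, reset at the start of each episode.
        e = [[0.0] * n_actions for _ in range(n_states)]
        s, a = 0, policy(0)
        done = False
        while not done:
            s2, r, done = step(s, a)
            a2 = policy(s2)
            delta = r + (0.0 if done else gamma * Q[s2][a2]) - Q[s][a]
            e[s][a] = 1.0  # replacing trace (set to 1, not accumulated)
            for si in range(n_states):
                for ai in range(n_actions):
                    Q[si][ai] += alpha * delta * e[si][ai]
                    e[si][ai] *= gamma * lam  # trace decay controlled by lambda
            s, a = s2, a2
    return Q
```

With a high `lam`, a single TD error propagates along the whole trace of recently visited state-action pairs, which is exactly the mechanism whose interaction with reward model and Q-table initialisation the paper examines.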