When function approximation is used in reinforcement learning, a crucial trade-off arises in the design process. Ideally, the chosen representation should allow as close an approximation of the value function as possible. However, the more expressive the representation, the larger the space of candidate hypotheses, and hence the more training data is needed. A less expressive representation has a smaller hypothesis space, so a good candidate can be found faster. The core idea of this paper is a mixed resolution function approximation: a less expressive approximator provides useful guidance early in learning, while a more expressive approximator yields a final result of high quality. A major question is how to combine the two representations. Two approaches are proposed and evaluated empirically.
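The idea can be sketched as follows. This is a minimal, hedged illustration, not the paper's actual method: the class names, the 1-D state space, the additive combination of the two resolutions, and the equal split of the TD error between layers are all assumptions made for this sketch.

```python
class TiledValue:
    """Piecewise-constant value function over [0, 1) with n_bins tiles.

    A small n_bins gives a coarse (less expressive) representation;
    a large n_bins gives a fine (more expressive) one.
    """
    def __init__(self, n_bins, lr):
        self.n_bins = n_bins
        self.lr = lr
        self.w = [0.0] * n_bins

    def _tile(self, s):
        # Map state s in [0, 1) to its tile index.
        return min(int(s * self.n_bins), self.n_bins - 1)

    def value(self, s):
        return self.w[self._tile(s)]

    def update(self, s, td_error):
        # Move the active tile's weight toward the target.
        self.w[self._tile(s)] += self.lr * td_error


class MixedResolutionValue:
    """One way to combine the two representations (an assumption here):
    V(s) = V_coarse(s) + V_fine(s).

    The coarse layer generalizes across many states and so improves
    quickly; the fine layer slowly refines the residual error.
    """
    def __init__(self):
        self.coarse = TiledValue(n_bins=4, lr=0.1)
        self.fine = TiledValue(n_bins=64, lr=0.1)

    def value(self, s):
        return self.coarse.value(s) + self.fine.value(s)

    def update(self, s, target):
        td_error = target - self.value(s)
        # Split the same TD error evenly between the two layers
        # (one possible combination scheme; others are conceivable).
        self.coarse.update(s, 0.5 * td_error)
        self.fine.update(s, 0.5 * td_error)
```

For example, trained on the target function V(s) = s², the coarse layer alone can only represent a 4-step staircase, while the combined approximator fits the curve closely; early in training, most of the estimate comes from the quickly-learned coarse layer.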