Patching approximate solutions in reinforcement learning

Authors:
Min Sub Kim; William Uther
Affiliations:
ARC Centre of Excellence for Autonomous Systems, School of Computer Science and Engineering, University of New South Wales, Sydney, NSW, Australia; National ICT Australia, Sydney, NSW, Australia
Venue:
ECML'06: Proceedings of the 17th European Conference on Machine Learning
Year:
2006

Abstract

This paper introduces an approach to improving an approximate solution in reinforcement learning by augmenting it with a small overriding patch. Many approximate solutions are smaller and easier to produce than a flat solution, but even the best solution within the constraints of the approximation may fall well short of global optimality. We present a technique for efficiently learning a small patch that reduces this gap. Empirical evaluation shows that patching is effective: the combined solutions are much closer to global optimality.
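
The idea of an overriding patch can be illustrated with a minimal sketch. This is not the paper's algorithm: the toy chain environment, the `step` function, the action names, and the hand-written patch are all assumptions made for illustration, and the way the patch is learned is not reproduced here. The sketch only shows how a small table of overrides can sit on top of a constrained base policy and change the combined behaviour.

```python
from typing import Callable, Dict, Tuple

State = int
Action = str
GOAL = 4  # hypothetical terminal state of a 5-cell chain (cells 0..4)


def step(state: State, action: Action) -> Tuple[State, float]:
    """Toy deterministic dynamics (assumed): 'right' moves 1 cell,
    'skip' moves 2 cells, and every move costs 1."""
    stride = 2 if action == "skip" else 1
    return min(state + stride, GOAL), -1.0


def base_policy(state: State) -> Action:
    """Stand-in for a constrained approximate solution:
    it picks the same action in every state."""
    return "right"


def make_patched_policy(
    base: Callable[[State], Action], patch: Dict[State, Action]
) -> Callable[[State], Action]:
    """Return a policy that uses the patch where it is defined
    and falls back to the base policy everywhere else."""
    def policy(state: State) -> Action:
        if state in patch:
            return patch[state]
        return base(state)
    return policy


def rollout_return(
    policy: Callable[[State], Action], start: State = 0, max_steps: int = 100
) -> float:
    """Undiscounted return of one deterministic episode."""
    state, total = start, 0.0
    for _ in range(max_steps):
        if state == GOAL:
            break
        state, reward = step(state, policy(state))
        total += reward
    return total


# Hand-written patch (an assumption; the paper learns its patch).
patch = {0: "skip", 2: "skip"}
patched_policy = make_patched_policy(base_policy, patch)

print(rollout_return(base_policy))     # base policy alone
print(rollout_return(patched_policy))  # base policy plus two overrides
```

In this toy setting the patch stores only two entries, yet the combined policy reaches the goal in fewer steps than the base policy alone, which is the kind of gap the abstract describes.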