On the role of tracking in stationary environments

Authors:
Richard S. Sutton; Anna Koop; David Silver
Affiliations:
University of Alberta, Edmonton, Canada (all authors)
Venue:
Proceedings of the 24th international conference on Machine learning
Year:
2007

Abstract

It is often thought that learning algorithms that track the best solution, as opposed to converging to it, are important only on nonstationary problems. We present three results suggesting that this is not so. First, we illustrate in a simple concrete example, the Black and White problem, that tracking can perform better than any converging algorithm on a stationary problem. Second, we show the same point on a larger, more realistic problem, an application of temporal difference learning to computer Go. Our third result suggests that tracking in stationary problems could be important for meta-learning research (e.g., learning to learn, feature selection, transfer). We apply a meta-learning algorithm for step-size adaptation, IDBD (Sutton, 1992a), to the Black and White problem, showing that meta-learning has a dramatic long-term effect on performance whereas, on an analogous converging problem, meta-learning has only a small second-order effect. This small result suggests a way of eventually overcoming a major obstacle to meta-learning research: the lack of an independent methodology for task selection.
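
For readers unfamiliar with IDBD, the following is a minimal sketch of its per-step update for a linear predictor, written from the commonly stated form of the algorithm (one log step-size `beta[i]` and one memory trace `h[i]` per input). It is not the authors' code, and it does not reproduce the Black and White problem, whose details are not given in this abstract. The toy task in `run_demo` (a noise-free target equal to twice the first input), the meta step-size `theta`, the initial step-size, and all function names are assumptions chosen only for illustration.

```python
import math
import random


def idbd_update(w, beta, h, x, target, theta=0.01):
    """Apply one IDBD step in place and return the prediction error.

    w    : weights of the linear predictor
    beta : per-input log step-sizes (step-size is exp(beta[i]))
    h    : per-input traces of recent weight changes
    theta: meta step-size (illustrative default, an assumption)
    """
    prediction = sum(wi * xi for wi, xi in zip(w, x))
    delta = target - prediction
    for i, xi in enumerate(x):
        # Meta-level step: raise the step-size when the current gradient
        # agrees in sign with recent weight changes, lower it otherwise.
        beta[i] += theta * delta * xi * h[i]
        alpha = math.exp(beta[i])
        # Base-level LMS step with the per-input step-size.
        w[i] += alpha * delta * xi
        # Decay the trace and add the latest weight change.
        h[i] = h[i] * max(0.0, 1.0 - alpha * xi * xi) + alpha * delta * xi
    return delta


def run_demo(steps=2000, seed=0):
    """Hypothetical toy task: the target is 2 * x[0]; x[1] is irrelevant."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    beta = [math.log(0.05), math.log(0.05)]  # assumed initial step-size 0.05
    h = [0.0, 0.0]
    for _ in range(steps):
        x = [rng.choice((-1.0, 1.0)), rng.choice((-1.0, 1.0))]
        idbd_update(w, beta, h, x, target=2.0 * x[0])
    return w, beta


if __name__ == "__main__":
    weights, log_step_sizes = run_demo()
    print("weights:", weights)
    print("step-sizes:", [math.exp(b) for b in log_step_sizes])
```

Because the step-sizes are never forced to decay toward zero, a learner of this kind keeps adapting indefinitely, which is the sense of "tracking" used in the abstract. A converging variant would instead shrink the step-size over time.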