Explanation-Based Learning and Reinforcement Learning: A Unified View

  • Authors:
  • Thomas G. Dietterich; Nicholas S. Flann

  • Affiliations:
  • Department of Computer Science, Oregon State University, Corvallis, OR 97331-3202. E-mail: tgd@cs.orst.edu; Department of Computer Science, Utah State University, Logan, UT 84322-4205. E-mail: flann@nick.cs.usu.edu

  • Venue:
  • Machine Learning
  • Year:
  • 1997

Abstract

In speedup-learning problems, where full descriptions of operators are known, both explanation-based learning (EBL) and reinforcement learning (RL) methods can be applied. This paper shows that both methods involve fundamentally the same process of propagating information backward from the goal toward the starting state. Most RL methods perform this propagation on a state-by-state basis, while EBL methods compute the weakest preconditions of operators, and hence perform this propagation on a region-by-region basis. Barto, Bradtke, and Singh (1995) have observed that many algorithms for reinforcement learning can be viewed as asynchronous dynamic programming. Based on this observation, this paper shows how to develop dynamic programming versions of EBL, which we call region-based dynamic programming or Explanation-Based Reinforcement Learning (EBRL). The paper compares batch and online versions of EBRL to batch and online versions of point-based dynamic programming and to standard EBL. The results show that region-based dynamic programming combines the strengths of EBL (fast learning and the ability to scale to large state spaces) with the strengths of reinforcement learning algorithms (learning of optimal policies). Results are shown in chess endgames and in synthetic maze tasks.
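
Since the abstract contrasts point-based (per-state) and region-based (per-region) backward propagation, a small sketch may make the distinction concrete. The following is a minimal illustration, not the paper's algorithm: it uses a toy deterministic maze, and a breadth-first wavefront stands in for the weakest-precondition regions that EBL/EBRL would derive from operator descriptions. All identifiers (point_based_dp, region_based_dp, the maze layout) are hypothetical.

# A minimal sketch (not the authors' implementation) contrasting the
# two propagation styles on a toy deterministic maze with reward -1
# per step. A BFS wavefront stands in for EBL's weakest-precondition
# regions; all names and the maze itself are illustrative assumptions.

SIZE = 4
GOAL = (3, 3)
WALLS = {(1, 1), (2, 1)}
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # deterministic operators

def neighbors(s):
    """States reachable from s in one move (moves are reversible here)."""
    for dx, dy in MOVES:
        t = (s[0] + dx, s[1] + dy)
        if 0 <= t[0] < SIZE and 0 <= t[1] < SIZE and t not in WALLS:
            yield t

def point_based_dp():
    """Asynchronous value iteration: one backup per individual state,
    sweeping until values stop changing."""
    V = {(x, y): float("-inf") for x in range(SIZE) for y in range(SIZE)
         if (x, y) not in WALLS}
    V[GOAL] = 0.0
    changed = True
    while changed:
        changed = False
        for s in V:
            if s == GOAL:
                continue
            best = max(V[t] - 1 for t in neighbors(s))
            if best > V[s]:
                V[s] = best
                changed = True
    return V

def region_based_dp():
    """Region-style propagation: each backup assigns a value to a whole
    set of states at once, analogous to EBRL backing up over the
    weakest precondition of an operator sequence."""
    V, frontier, value = {GOAL: 0.0}, {GOAL}, 0
    while frontier:
        value -= 1
        region = {t for s in frontier for t in neighbors(s)} - V.keys()
        for s in region:
            V[s] = value  # one backup covers the entire region
        frontier = region
    return V

if __name__ == "__main__":
    assert point_based_dp() == region_based_dp()
    print("start-state value:", region_based_dp()[(0, 0)])  # -6 here

Both routines converge to the same optimal value function; the difference is that the region-based version performs one backup per region instead of one per state. In the paper's setting the regions are described intensionally by weakest preconditions rather than enumerated state by state, which is what allows EBRL to scale to large state spaces.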