Quantitative results concerning the utility of explanation-based learning
Artificial Intelligence
Hidden surface removal for rectangles
Journal of Computer and System Sciences
Introduction to algorithms
Is learning rate a good performance criterion for learning?
Proceedings of the Seventh International Conference on Machine Learning (1990)
Using local models to control movement
Advances in neural information processing systems 2
Practical Issues in Temporal Difference Learning
Machine Learning
Technical Note: Q-Learning
Machine Learning
Correct abstraction in counter-planning: a knowledge compilation approach
Learning to act using real-time dynamic programming
Artificial Intelligence - Special volume on computational research on interaction and agency, part 1
Learning to Predict by the Methods of Temporal Differences
Machine Learning
Chunking in Soar: The Anatomy of a General Learning Mechanism
Machine Learning
Explanation-Based Generalization: A Unifying View
Machine Learning
Learning to Predict in Uncertain Continuous Tasks
ML '92 Proceedings of the Ninth International Workshop on Machine Learning
Dynamic Programming
Learning effective search control knowledge: an explanation-based approach
Reinforcement learning: a survey
Journal of Artificial Intelligence Research
A reinforcement learning approach to job-shop scheduling
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
On verifying game designs and playing strategies using reinforcement learning
Proceedings of the 2001 ACM symposium on Applied computing
Machines that learn to play games
Software as Learning: Quality Factors and Life-Cycle Revised
FASE '00 Proceedings of the Third International Conference on Fundamental Approaches to Software Engineering: Held as Part of the European Joint Conferences on the Theory and Practice of Software, ETAPS 2000
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Dynamic programming for structured continuous Markov decision problems
UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence
Player Co-Modelling in a Strategy Board Game: Discovering How to Play Fast
Cybernetics and Systems
An Inductive Logic Programming Approach to Statistical Relational Learning
Intensional dynamic programming. A Rosetta stone for structured dynamic programming
Journal of Algorithms
Guiding inference through relational reinforcement learning
ILP'05 Proceedings of the 15th international conference on Inductive Logic Programming
In speedup-learning problems, where full descriptions of operators are known, both explanation-based learning (EBL) and reinforcement learning (RL) methods can be applied. This paper shows that both methods involve fundamentally the same process of propagating information backward from the goal toward the starting state. Most RL methods perform this propagation on a state-by-state basis, while EBL methods compute the weakest preconditions of operators, and hence, perform this propagation on a region-by-region basis. Barto, Bradtke, and Singh (1995) have observed that many algorithms for reinforcement learning can be viewed as asynchronous dynamic programming. Based on this observation, this paper shows how to develop dynamic programming versions of EBL, which we call region-based dynamic programming or Explanation-Based Reinforcement Learning (EBRL). The paper compares batch and online versions of EBRL to batch and online versions of point-based dynamic programming and to standard EBL. The results show that region-based dynamic programming combines the strengths of EBL (fast learning and the ability to scale to large state spaces) with the strengths of reinforcement learning algorithms (learning of optimal policies). Results are shown in chess endgames and in synthetic maze tasks.
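The state-by-state backward propagation that the abstract attributes to most RL methods can be illustrated with a minimal sketch. The toy maze below (the `GOAL`, `WALLS`, and grid size are invented for illustration, not taken from the paper) runs point-based value iteration, sweeping states until no cost-to-go improves; the region-based EBRL variant the paper proposes would instead update entire regions at once via weakest preconditions of operators.

```python
# Point-based backward value propagation (asynchronous DP style) on a
# toy deterministic 4x4 maze. The domain here is a made-up example,
# not the chess-endgame or maze tasks from the paper.

GOAL = (3, 3)
WALLS = {(1, 1), (2, 1)}
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
SIZE = 4

def states():
    return [(r, c) for r in range(SIZE) for c in range(SIZE)
            if (r, c) not in WALLS]

def step(s, a):
    """Deterministic transition; bumping a wall or edge leaves you in place."""
    nxt = (s[0] + a[0], s[1] + a[1])
    return nxt if nxt in set(states()) else s

def value_iteration():
    # Cost-to-go: 0 at the goal, +1 per step elsewhere.
    V = {s: float('inf') for s in states()}
    V[GOAL] = 0.0
    changed = True
    while changed:  # sweep state-by-state until values stop improving
        changed = False
        for s in states():
            if s == GOAL:
                continue
            best = min(1.0 + V[step(s, a)] for a in ACTIONS)
            if best < V[s]:
                V[s] = best
                changed = True
    return V

V = value_iteration()
```

Note that every update touches a single state; the contrast drawn in the abstract is that an EBL-style weakest-precondition computation would assign the same backed-up value to a whole region of states in one step, which is why EBRL can scale to large state spaces.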