Automatic Induction of Bellman-Error Features for Probabilistic Planning

Authors:
Jia-Hong Wu; Robert Givan
Affiliations:
Electrical and Computer Engineering, Purdue University, W. Lafayette, IN; Electrical and Computer Engineering, Purdue University, W. Lafayette, IN
Venue:
Journal of Artificial Intelligence Research
Year:
2010

Abstract

Domain-specific features are important in representing problem structure throughout machine learning and decision-theoretic planning. In planning, once state features are provided, domain-independent algorithms such as approximate value iteration can learn weighted combinations of those features that often perform well as heuristic estimates of state value (e.g., distance to the goal). Successful applications in real-world domains often require features crafted by human experts. Here, we propose automatic processes for learning useful domain-specific feature sets with little or no human intervention. During approximate value iteration, our method selects and adds features that describe state-space regions where the current value estimate is highly inconsistent with the Bellman equation, that is, regions of high statewise Bellman error. The method can be applied with any real-valued-feature hypothesis space and a corresponding learning method for selecting features from training sets of state-value pairs. We evaluate the method with hypothesis spaces defined by both relational and propositional feature languages, using nine probabilistic planning domains. We show that approximate value iteration using a relational feature space performs at the state of the art in domain-independent stochastic relational planning. Our method provides the first domain-independent approach that plays Tetris successfully (without human-engineered features).
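The idea of adding features where the statewise Bellman error is high can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a small hypothetical chain MDP, a least-squares projection step for approximate value iteration, and a fixed pool of hand-written candidate features scored by their correlation with the Bellman error, in place of the relational and propositional feature learners evaluated in the paper. All function names (`build_chain`, `select_feature`, and so on) are invented for illustration.

```python
import numpy as np

def build_chain(n=6, p_success=0.8):
    """Hypothetical chain MDP: actions 0 = left, 1 = right; last state is an absorbing goal."""
    P = np.zeros((2, n, n))
    for s in range(n - 1):
        left, right = max(s - 1, 0), s + 1
        P[0, s, left] += p_success
        P[0, s, s] += 1 - p_success
        P[1, s, right] += p_success
        P[1, s, s] += 1 - p_success
    P[:, n - 1, n - 1] = 1.0          # goal state loops to itself
    R = -np.ones(n)                   # unit cost per step ...
    R[n - 1] = 0.0                    # ... except at the goal
    return P, R

def bellman_backup(P, R, gamma, V):
    """(TV)(s) = max_a [ R(s) + gamma * sum_s' P(a, s, s') V(s') ]."""
    return np.max(R[None, :] + gamma * (P @ V), axis=0)

def statewise_bellman_error(P, R, gamma, V):
    """Per-state inconsistency of V with the Bellman equation: (TV)(s) - V(s)."""
    return bellman_backup(P, R, gamma, V) - V

def approximate_value_iteration(P, R, gamma, Phi, iters=50):
    """Repeatedly fit feature weights to the Bellman backup by least squares (an assumed variant)."""
    w = np.zeros(Phi.shape[1])
    for _ in range(iters):
        targets = bellman_backup(P, R, gamma, Phi @ w)
        w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
    return w

def select_feature(candidates, error):
    """Return the index of the candidate most correlated (in absolute value) with the error."""
    centered_err = error - error.mean()
    scores = []
    for f in candidates:
        cf = f - f.mean()
        denom = np.linalg.norm(cf) * np.linalg.norm(centered_err)
        scores.append(abs(cf @ centered_err) / denom if denom > 0 else 0.0)
    return int(np.argmax(scores))

if __name__ == "__main__":
    n, gamma = 6, 0.9
    P, R = build_chain(n)
    Phi = np.ones((n, 1))                               # start with a constant feature only
    w = approximate_value_iteration(P, R, gamma, Phi)
    err = statewise_bellman_error(P, R, gamma, Phi @ w)
    states = np.arange(n)
    candidates = [states / (n - 1),                     # normalized position
                  (states == n - 1).astype(float),      # goal indicator
                  (states % 2).astype(float)]           # parity
    best = select_feature(candidates, err)
    Phi = np.column_stack([Phi, candidates[best]])      # grow the feature set
    print("selected candidate:", best)
```

In this toy setting, a constant-only value estimate leaves a Bellman error that differs only between goal and non-goal states, so the goal-indicator candidate is the one selected. Repeating the loop (refit weights, recompute the error, select another feature) mirrors the incremental scheme the abstract describes, under the simplifying assumptions stated above.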