A longstanding goal in planning research is the ability to generalize plans developed for some set of environments to a new but similar environment, with minimal or no replanning. Such generalization can both reduce planning time and allow us to tackle larger domains than those tractable for direct planning. In this paper, we present an approach to the generalization problem based on a new framework of relational Markov Decision Processes (RMDPs). An RMDP can model a set of similar environments by representing objects as instances of different classes. In order to generalize plans to multiple environments, we define an approximate value function specified in terms of classes of objects and, in a multiagent setting, classes of agents. This class-based approximate value function is optimized relative to a sampled subset of environments and computed using an efficient linear programming method. We prove that a polynomial number of sampled environments suffices to achieve performance close to that achievable when optimizing over the entire space. Our experimental results show that our method generalizes plans successfully to new, significantly larger environments, with minimal loss of performance relative to environment-specific planning. We demonstrate our approach on a real strategic computer war game.
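The key computational ideas in the abstract are (i) a value function whose weights are tied to object *classes* rather than individual objects, so the same weights apply to environments of any size, and (ii) optimizing those weights with a linear program. The toy sketch below illustrates that combination on a single-class "repair" domain using the standard approximate-linear-programming formulation (minimize the summed value estimate subject to Bellman-style lower-bound constraints). It is a hypothetical illustration, not the authors' algorithm or code: the domain, the features, and all names (`features`, `reward`, `transition`, the failure probability 0.1) are invented for the example, and for brevity it enumerates one small environment's states rather than sampling many environments.

```python
# Hypothetical sketch of a class-based value function fit by linear
# programming, in the spirit of (but not identical to) the abstract.
import itertools
import numpy as np
from scipy.optimize import linprog

GAMMA = 0.9
N_OBJECTS = 3                       # all objects belong to one class
STATES = list(itertools.product([0, 1], repeat=N_OBJECTS))  # 0=broken, 1=working
ACTIONS = range(N_OBJECTS)          # action i = repair object i

def features(s):
    # Class-based features: a bias term plus the count of working objects
    # of the (single) class.  Because the weight is shared by the class,
    # the learned weight vector transfers to environments of any size.
    return np.array([1.0, float(sum(s))])

def reward(s):
    return float(sum(s))            # one unit of reward per working object

def transition(s, a):
    # The repaired object works next step; each other working object
    # fails with probability 0.1.  Returns (probability, next_state) pairs.
    outcomes = [(1.0, ())]
    for i, x in enumerate(s):
        if i == a:
            branches = [(1.0, 1)]
        elif x:
            branches = [(0.9, 1), (0.1, 0)]
        else:
            branches = [(1.0, 0)]
        outcomes = [(p * q, ns + (y,)) for p, ns in outcomes for q, y in branches]
    return outcomes

# Approximate LP:  min_w  sum_s phi(s).w
#                  s.t.   phi(s).w >= R(s) + GAMMA * E[phi(s').w]  for all s, a
c = sum(features(s) for s in STATES)
A_ub, b_ub = [], []
for s in STATES:
    for a in ACTIONS:
        exp_phi = sum(p * features(ns) for p, ns in transition(s, a))
        A_ub.append(-(features(s) - GAMMA * exp_phi))   # flip sign for A_ub x <= b_ub
        b_ub.append(-reward(s))
res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * 2)
w = res.x
print("class-based weights:", w)
```

The point of the sketch is the shape of the optimization, not the numbers: the weight vector has one entry per class-level feature, so its dimension is independent of how many objects the environment contains, which is what makes the paper's transfer to larger environments possible.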