State abstraction for programmable reinforcement learning agents

  • Authors:
  • David Andre and Stuart J. Russell

  • Affiliations:
  • Computer Science Division, UC Berkeley, Berkeley, CA (both authors)

  • Venue:
  • Eighteenth National Conference on Artificial Intelligence (AAAI-02)
  • Year:
  • 2002

Abstract

Safe state abstraction in reinforcement learning allows an agent to ignore aspects of its current state that are irrelevant to its current decision, and therefore speeds up dynamic programming and learning. This paper explores safe state abstraction in hierarchical reinforcement learning, where learned behaviors must conform to a given partial, hierarchical program. Unlike previous approaches to this problem, our methods yield significant state abstraction while maintaining hierarchical optimality, i.e., optimality among all policies consistent with the partial program. We show how to achieve this for a partial programming language that is essentially Lisp augmented with nondeterministic constructs. We demonstrate our methods on two variants of Dietterich's taxi domain, showing how state abstraction and hierarchical optimality result in faster learning of better policies and enable the transfer of learned skills from one problem to another.
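Below is a minimal, illustrative sketch of the core idea the abstract describes: a partial program whose fixed structure constrains the agent, with learning happening only at nondeterministic choice points, and with Q-values at each choice point indexed by an abstracted state that omits irrelevant variables. It is written in Python rather than the Lisp-based language the paper uses, on a toy one-dimensional "taxi" corridor; the names `TinyTaxi`, `Learner`, `choose`, `navigate`, and `root` are all hypothetical. The sketch also omits the paper's key technical device (decomposing the value function so that such abstractions remain safe under hierarchical optimality); here the abstraction at the navigation choice point happens to be harmless because the greedy action does not depend on the aliased information.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.2, 0.95, 0.1
Q = defaultdict(float)          # Q[(choice_label, abstract_state, option)]

class TinyTaxi:
    """Toy 1-D corridor: pick up the passenger, drop them at the destination."""
    N = 5
    def __init__(self):
        self.taxi = random.randrange(self.N)
        self.passenger = random.randrange(self.N)
        self.dest = random.randrange(self.N)
        self.holding = False
    def move(self, d):          # primitive action: -1 (left) or +1 (right)
        self.taxi = min(max(self.taxi + d, 0), self.N - 1)
        return -1               # per-step cost, as in Dietterich's taxi domain
    def pickup(self):
        if not self.holding and self.taxi == self.passenger:
            self.holding = True
            return -1
        return -10              # illegal pickup
    def dropoff(self):
        if self.holding and self.taxi == self.dest:
            return 20           # successful delivery
        return -10              # illegal dropoff

class Learner:
    """SMDP-style Q-learning applied only at the choice points of the partial program."""
    def __init__(self):
        self.prev = None        # key of the previous choice point
        self.r = 0.0            # discounted reward accumulated since that choice
        self.g = 1.0            # running discount factor
    def choose(self, label, abstract_state, options):
        keys = [(label, abstract_state, o) for o in options]
        best = max(keys, key=Q.__getitem__)
        if self.prev is not None:   # update the previous choice with the accrued reward
            Q[self.prev] += ALPHA * (self.r + self.g * Q[best] - Q[self.prev])
        pick = random.choice(keys) if random.random() < EPSILON else best
        self.prev, self.r, self.g = pick, 0.0, 1.0
        return pick[2]
    def reward(self, r):
        self.r += self.g * r
        self.g *= GAMMA
    def finish(self):           # terminal update at the end of the episode
        if self.prev is not None:
            Q[self.prev] += ALPHA * (self.r - Q[self.prev])

def navigate(env, learner, target):
    """Subroutine whose choice point sees only (taxi position, target):
    the passenger's destination and the holding flag are abstracted away."""
    while env.taxi != target:
        d = learner.choose("nav", (env.taxi, target), [-1, +1])
        learner.reward(env.move(d))

def root(env, learner):
    """Top-level partial program: the subtask ordering is fixed by the programmer,
    while the navigation decisions are left to the learner."""
    navigate(env, learner, env.passenger)
    learner.reward(env.pickup())
    navigate(env, learner, env.dest)
    learner.reward(env.dropoff())
    learner.finish()

for _ in range(2000):           # train on randomly generated episodes
    root(TinyTaxi(), Learner())
```

In this toy, sharing one table entry for every episode that navigates to the same target from the same square is exactly the kind of state abstraction the abstract refers to: the learner ignores the passenger's destination while navigating, so experience transfers across episodes, yet the fixed program structure still dictates the overall task order.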