A model for reasoning about persistence and causation
Computational Intelligence
Using abstractions for decision-theoretic planning with time constraints
AAAI'94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 2)
Formal Methods in System Design - Special issue on symmetry in automatic verification
Reinforcement learning with hierarchies of machines
NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning
Artificial Intelligence
Bounded-parameter Markov decision processes
Artificial Intelligence
A Heuristic Approach to the Discovery of Macro-Operators
Machine Learning
Symmetry in Markov Decision Processes and its Implications for Single Agent and Multiagent Learning
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Model Minimization in Hierarchical Reinforcement Learning
Proceedings of the 5th International Symposium on Abstraction, Reformulation and Approximation
Equivalence notions and model minimization in Markov decision processes
Artificial Intelligence - special issue on planning with uncertainty and incomplete information
Symmetries and Model Minimization in Markov Decision Processes
Algebraic structure theory of sequential machines (Prentice-Hall international series in applied mathematics)
Hierarchical reinforcement learning with the MAXQ value function decomposition
Journal of Artificial Intelligence Research
Exploiting structure in policy construction
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Model minimization in Markov decision processes
AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
Learning state-action basis functions for hierarchical MDPs
Proceedings of the 24th international conference on Machine learning
Reinforcement learning for problems with symmetrical restricted states
Robotics and Autonomous Systems
Learning Representation and Control in Markov Decision Processes: New Frontiers
Foundations and Trends® in Machine Learning
Decision tree methods for finding reusable MDP homomorphisms
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Compositional Models for Reinforcement Learning
ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Samuel meets Amarel: automating value function approximation using global state space analysis
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Abstraction in predictive state representations
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Learning and multiagent reasoning for autonomous agents
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
An experts algorithm for transfer learning
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Efficiently exploiting symmetries in real time dynamic programming
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
State abstraction discovery from irrelevant state variables
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Learning to generalize and reuse skills using approximate partial policy homomorphisms
SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
Computing and using lower and upper bounds for action elimination in MDP planning
SARA'07 Proceedings of the 7th International conference on Abstraction, reformulation, and approximation
Finding and transferring policies using stored behaviors
Autonomous Robots
Structural abstraction experiments in reinforcement learning
AI'05 Proceedings of the 18th Australian Joint conference on Advances in Artificial Intelligence
Transfer in reinforcement learning via shared features
The Journal of Machine Learning Research
To operate effectively in complex environments, learning agents require the ability to selectively ignore irrelevant details and form useful abstractions. In this article, we consider the question of what constitutes a useful abstraction in a stochastic sequential decision problem modeled as a semi-Markov Decision Process (SMDP). We introduce the notion of an SMDP homomorphism and argue that it provides a useful tool for a rigorous study of abstraction for SMDPs. We present an SMDP minimization framework and an abstraction framework for factored MDPs, both based on SMDP homomorphisms. We also model different classes of abstractions that arise in hierarchical systems. Although we use the options framework for purposes of illustration, the ideas are more generally applicable. We also show that the abstraction conditions we employ generalize earlier work by Dietterich as applied to the options framework.