Efficiently exploiting symmetries in real time dynamic programming

Authors:
Shravan Matthur Narayanamurthy;Balaraman Ravindran
Affiliations:
Department of Computer Science and Engineering, Indian Institute of Technology Madras;Department of Computer Science and Engineering, Indian Institute of Technology Madras
Venue:
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Year:
2007

Citing 13
Cited 1

Learning to act using real-time dynamic programming. Artificial Intelligence - Special volume on computational research on interaction and agency, part 1
Symmetry and model checking. Formal Methods in System Design - Special issue on symmetry in automatic verification
Coalition structure generation with worst case guarantees. Artificial Intelligence
Computers, Chess, and Cognition
Symmetry in Markov Decision Processes and its Implications for Single Agent and Multiagent Learning. ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Model Minimization in Hierarchical Reinforcement Learning. Proceedings of the 5th International Symposium on Abstraction, Reformulation and Approximation
Artificial Intelligence: A Modern Approach
Equivalence notions and model minimization in Markov decision processes. Artificial Intelligence - special issue on planning with uncertainty and incomplete information
Settling the Complexity of Two-Player Nash Equilibrium. FOCS '06 Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science
On strictly competitive multi-player games. AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Competitive safety analysis: robust decision-making in multi-agent systems. Journal of Artificial Intelligence Research
SMDP homomorphisms: an algebraic approach to abstraction in semi-Markov decision processes. IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Worst-case equilibria. STACS'99 Proceedings of the 16th annual conference on Theoretical aspects of computer science

On the Hardness and Existence of Quasi-Strict Equilibria. SAGT '08 Proceedings of the 1st International Symposium on Algorithmic Game Theory

Quantified Score

Hi-index	0.00

Abstract

Current approaches to solving Markov Decision Processes (MDPs) are sensitive to the size of the MDP. When applied to real-world problems, however, MDPs exhibit considerable implicit redundancy, especially in the form of symmetries. Existing model minimization methods do not exploit this symmetry-induced redundancy well. In this work, given such symmetries, we present a time-efficient algorithm to construct a functionally equivalent reduced model of the MDP. Further, we present a Real Time Dynamic Programming (RTDP) algorithm that integrates the given symmetries directly and so obviates an explicit construction of the reduced model. The RTDP algorithm solves the reduced model while working with the parameters of the original model and the given symmetries. Because RTDP uses its experience to determine which states to back up, it focuses on the parts of the reduced state set that are most relevant. This results in significantly faster learning and a reduced overall execution time. The proposed algorithms are particularly effective in the case of structured automorphisms, even when the reduced model does not have fewer features. We demonstrate the results empirically on several domains.
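
The idea of running RTDP on the original model while storing values only for the reduced state set can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes a hypothetical 4x4 grid world with a single known symmetry (reflection across the diagonal), a unit cost per step, and a small slip probability. All names (`canon`, `transitions`, `q_value`, `rtdp`, `SLIP`) are illustrative assumptions.

```python
import random

N = 4                      # grid side length (illustrative assumption)
GOAL = (N - 1, N - 1)      # goal lies on the diagonal, so the reflection fixes it
ACTIONS = {"right": (1, 0), "up": (0, 1), "left": (-1, 0), "down": (0, -1)}
SLIP = 0.1                 # assumed probability that a move fails and the agent stays put


def canon(state):
    """Map a state to its orbit representative under (x, y) -> (y, x)."""
    x, y = state
    return (x, y) if x <= y else (y, x)


def transitions(state, action):
    """Original-model dynamics: list of (probability, next_state)."""
    dx, dy = ACTIONS[action]
    x, y = state
    nx, ny = x + dx, y + dy
    if not (0 <= nx < N and 0 <= ny < N):
        nx, ny = x, y      # bumping into a wall leaves the state unchanged
    return [(1.0 - SLIP, (nx, ny)), (SLIP, (x, y))]


def q_value(values, state, action):
    """Expected cost-to-go of `action`, reading successor values through canon()."""
    return 1.0 + sum(p * values.get(canon(s2), 0.0)
                     for p, s2 in transitions(state, action))


def rtdp(start, trials=200, max_steps=500, seed=0):
    """Run RTDP trials from `start`; values are keyed by canonical states only."""
    rng = random.Random(seed)
    values = {}            # unseen states default to 0, an admissible initial guess
    for _ in range(trials):
        state = start
        for _ in range(max_steps):
            if state == GOAL:
                break
            # Greedy Bellman backup at the visited state, shared by its whole orbit.
            best = min(ACTIONS, key=lambda a: q_value(values, state, a))
            values[canon(state)] = q_value(values, state, best)
            # Sample the successor from the original model.
            r, acc = rng.random(), 0.0
            for p, s2 in transitions(state, best):
                acc += p
                if r < acc:
                    break
            state = s2
    return values
```

Calling `rtdp((0, 0))` returns a table with at most N(N+1)/2 entries rather than N^2, because symmetric states such as (1, 3) and (3, 1) share one value. The backups still use the transition model of the original grid, so the reduced model is never built explicitly.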