Model selection in markovian processes

Authors:
Assaf Hallak;Dotan Di-Castro;Shie Mannor
Affiliations:
Technion, Haifa, Israel;Technion, Haifa, Israel;Technion, Haifa, Israel
Venue:
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2013

Abstract

When analyzing data that originated from a dynamical system, a common practice is to cast the problem in the well-known frameworks of Markov Decision Processes (MDPs) and Reinforcement Learning (RL). The state space in these solutions is usually chosen heuristically, and the resulting MDP can then be used to simulate and predict data, as well as to indicate the best possible action in each state. The model chosen to characterize the data affects the complexity and accuracy of any further action we may wish to apply, yet few methods that rely on the dynamic structure to select such a model have been suggested. In this work we address the problem of how to use time series data to choose from a finite set of candidate discrete state spaces, where these spaces are constructed by a domain expert. We formalize the notion of model selection consistency in the proposed setup. We then discuss the difference between our proposed framework and the classical Maximum Likelihood (ML) framework, and give an example where ML fails. Afterwards, we suggest alternative selection criteria and show them to be weakly consistent. We then define weak consistency for a model construction algorithm and show a simple algorithm that is weakly consistent. Finally, we test the performance of the suggested criteria and algorithm on both simulated and real-world data.
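To make the difficulty with likelihood-based selection concrete, the following is a minimal sketch and not the example or the criteria from the paper. It assumes each candidate state space is given as a mapping from raw observations to discrete states (the mappings, the toy trajectory, and the function name `mapped_log_likelihood` are all hypothetical). It fits a first-order Markov chain by maximum likelihood on each mapped sequence and compares the resulting log-likelihoods.

```python
import math
from collections import Counter


def mapped_log_likelihood(observations, mapping):
    """Log-likelihood of the mapped state sequence under the empirical
    (maximum likelihood) transition estimates of a first-order Markov chain.
    The initial-state term is ignored."""
    states = [mapping(o) for o in observations]
    # Count observed transitions (s, s') and how often each s is left.
    pair_counts = Counter(zip(states[:-1], states[1:]))
    out_counts = Counter(states[:-1])
    # ML estimate: P(s' | s) = count(s, s') / count(s).
    return sum(
        n * math.log(n / out_counts[s]) for (s, _), n in pair_counts.items()
    )


# Hypothetical toy trajectory of raw observations.
observations = [0, 1, 2, 3, 0, 1, 2, 3, 0, 2, 1, 3]

# Hypothetical candidate state spaces, each expressed as an observation-to-state map.
candidates = {
    "fine (4 states)": lambda o: o,
    "parity (2 states)": lambda o: o % 2,
    "trivial (1 state)": lambda o: 0,
}

scores = {
    name: mapped_log_likelihood(observations, mapping)
    for name, mapping in candidates.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

In this sketch the single-state candidate always attains a log-likelihood of zero, the largest possible value, because every mapped transition is deterministic. Likelihoods computed over different state spaces are therefore not directly comparable, which is one plausible reading of why a different selection criterion is needed in this setting.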