A model for reasoning about persistence and causation
Computational Intelligence
Markov Decision Processes: Discrete Stochastic Dynamic Programming
Markov Decision Processes: Discrete Stochastic Dynamic Programming
Sequential cost-sensitive decision making with reinforcement learning
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Equivalence notions and model minimization in Markov decision processes
Artificial Intelligence - special issue on planning with uncertainty and incomplete information
Reinforcement learning with selective perception and hidden state
Reinforcement learning with selective perception and hidden state
Pattern Classification (2nd Edition)
Pattern Classification (2nd Edition)
Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing)
Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing)
Dynamic Catalog Mailing Policies
Management Science
State abstraction discovery from irrelevant state variables
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Learning Representation and Control in Markov Decision Processes
Learning Representation and Control in Markov Decision Processes
Model selection in reinforcement learning
Machine Learning
Paper: Modeling by shortest data description
Automatica (Journal of IFAC)
Hi-index | 0.00 |
When analyzing data that originated from a dynamical system, a common practice is to encompass the problem in the well known frameworks of Markov Decision Processes (MDPs) and Reinforcement Learning (RL). The state space in these solutions is usually chosen in some heuristic fashion and the formed MDP can then be used to simulate and predict data, as well as indicate the best possible action in each state. The model chosen to characterize the data affects the complexity and accuracy of any further action we may wish to apply, yet few methods that rely on the dynamic structure to select such a model were suggested. In this work we address the problem of how to use time series data to choose from a finite set of candidate discrete state spaces, where these spaces are constructed by a domain expert. We formalize the notion of model selection consistency in the proposed setup. We then discuss the difference between our proposed framework and the classical Maximum Likelihood (ML) framework, and give an example where ML fails. Afterwards, we suggest alternative selection criteria and show them to be weakly consistent. We then define weak consistency for a model construction algorithm and show a simple algorithm that is weakly consistent. Finally, we test the performance of the suggested criteria and algorithm on both simulated and real world data.