Elements of information theory
Introduction to Reinforcement Learning
Neuro-Dynamic Programming
Simulated Annealing: A Proof of Convergence. IEEE Transactions on Pattern Analysis and Machine Intelligence
On the undecidability of probabilistic planning and related stochastic optimization problems. Artificial Intelligence (special issue on planning with uncertainty and incomplete information)
Equivalence notions and model minimization in Markov decision processes. Artificial Intelligence (special issue on planning with uncertainty and incomplete information)
Reinforcement learning with selective perception and hidden state
R-max - a general polynomial time algorithm for near-optimal reinforcement learning. The Journal of Machine Learning Research
Universal Artificial Intelligence: Sequential Decisions Based On Algorithmic Probability
Predictive state representations: a new theory for modeling dynamical systems. UAI '04: Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence
Statistical and Inductive Inference by Minimum Message Length (Information Science and Statistics)
Probabilistic Finite-State Machines - Part I. IEEE Transactions on Pattern Analysis and Machine Intelligence
Stochastic Optimization (Scientific Computation)
The Minimum Description Length Principle (Adaptive Computation and Machine Learning)
Monte Carlo Strategies in Scientific Computing
Planning and acting in partially observable stochastic domains. Artificial Intelligence
Universal reinforcement learning. IEEE Transactions on Information Theory
Reinforcement learning with perceptual aliasing: the perceptual distinctions approach. AAAI'92: Proceedings of the Tenth National Conference on Artificial Intelligence
Consistency of feature Markov processes. ALT'10: Proceedings of the 21st International Conference on Algorithmic Learning Theory
A Monte-Carlo AIXI approximation. Journal of Artificial Intelligence Research
Bandit based Monte-Carlo planning. ECML'06: Proceedings of the 17th European Conference on Machine Learning
A universal data compression system. IEEE Transactions on Information Theory
The context-tree weighting method: basic properties. IEEE Transactions on Information Theory
Following a recent surge in the use of history-based methods for resolving perceptual aliasing in reinforcement learning, we introduce an algorithm based on the feature reinforcement learning framework ΦMDP [13]. To obtain a practical algorithm, we devise a stochastic search procedure over a class of context trees, based on parallel tempering and a specialized proposal distribution. We provide the first empirical evaluation of ΦMDP. Our algorithm outperforms the classical U-tree algorithm [20] and the recent active-LZ algorithm [6], and is competitive with MC-AIXI-CTW [29], which maintains a Bayesian mixture over all context trees up to a chosen depth. We are encouraged that we can compete with this sophisticated method using an algorithm that simply picks a single model and runs Q-learning on the corresponding MDP; our ΦMDP algorithm is simpler and consumes less time and memory. These results are promising for our future work on larger and more complex problems.
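As a rough illustration of the search component described above, the sketch below implements generic parallel tempering: several Metropolis chains run at different temperatures and periodically swap states, which helps the coldest chain escape local minima. This is a minimal, hedged sketch, not the paper's actual ΦMDP search; the cost function, proposal, and toy integer state space here are invented for illustration (in the paper the states would be context trees scored by the ΦMDP cost and mutated by the specialized proposal distribution).

```python
import math
import random

def parallel_tempering(cost, propose, init, temps, steps, seed=0):
    """Minimise `cost` with one Metropolis chain per temperature in `temps`
    (sorted from cold to hot), attempting replica swaps between rounds."""
    rng = random.Random(seed)
    states = [init() for _ in temps]      # one replica per temperature
    costs = [cost(s) for s in states]
    best, best_cost = states[0], costs[0]
    for _ in range(steps):
        # Metropolis update within each chain.
        for i, T in enumerate(temps):
            cand = propose(states[i], rng)
            c = cost(cand)
            if c <= costs[i] or rng.random() < math.exp((costs[i] - c) / T):
                states[i], costs[i] = cand, c
            if costs[i] < best_cost:
                best, best_cost = states[i], costs[i]
        # Attempt to swap a random adjacent pair of replicas; the standard
        # acceptance rule uses the inverse-temperature and cost differences.
        i = rng.randrange(len(temps) - 1)
        d = (1.0 / temps[i] - 1.0 / temps[i + 1]) * (costs[i] - costs[i + 1])
        if d >= 0 or rng.random() < math.exp(d):
            states[i], states[i + 1] = states[i + 1], states[i]
            costs[i], costs[i + 1] = costs[i + 1], costs[i]
    return best, best_cost

# Toy usage: minimise a multimodal function over the integers.
f = lambda x: (x % 7) + 0.1 * abs(x - 30)
state, value = parallel_tempering(
    cost=f,
    propose=lambda x, rng: x + rng.choice([-3, -1, 1, 3]),
    init=lambda: 0,
    temps=[0.2, 0.5, 1.0, 2.0],
    steps=2000,
)
```

The hot chains explore broadly while the cold chain exploits; the swap move is what lets a good state discovered at high temperature migrate down to the cold chain.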