We consider an agent interacting with an unmodeled environment. At each time step, the agent makes an observation, takes an action, and incurs a cost. Its actions can influence future observations and costs. The goal is to minimize the long-term average cost. We propose a novel algorithm, known as the active LZ algorithm, for optimal control based on ideas from the Lempel-Ziv scheme for universal data compression and prediction. We establish that, under the active LZ algorithm, if there exists an integer K such that the future is conditionally independent of the past given a window of K consecutive actions and observations, then the average cost converges to the optimum. Experimental results involving the game of Rock-Paper-Scissors illustrate the merits of the algorithm.
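To make the Lempel-Ziv connection concrete, the following is a minimal, hypothetical sketch of the prediction component only: an LZ78-style parse tree with visit counts, used to estimate the next symbol of an action-observation stream. It is an illustration of the underlying idea, not the active LZ algorithm itself (which additionally folds in costs and action selection); the class name, smoothing rule, and interface are assumptions made for this example.

```python
class LZ78Predictor:
    """Illustrative LZ78-style predictor (not the full active LZ algorithm).

    The symbol stream is parsed into phrases, as in Lempel-Ziv
    compression; visit counts stored along the current phrase's path in
    the parse tree are used to estimate the next-symbol distribution.
    """

    def __init__(self, alphabet):
        self.alphabet = list(alphabet)
        self.root = self._new_node()
        self.node = self.root  # position within the current phrase

    @staticmethod
    def _new_node():
        return {"children": {}, "count": 0}

    def predict(self):
        """Next-symbol distribution at the current node, with add-one
        (Laplace) smoothing so unseen symbols keep nonzero mass."""
        counts = {a: 1 for a in self.alphabet}
        for sym, child in self.node["children"].items():
            counts[sym] += child["count"]
        total = sum(counts.values())
        return {a: c / total for a, c in counts.items()}

    def update(self, symbol):
        """Advance along the tree; when the phrase leaves the tree,
        add a new leaf and restart at the root (an LZ78 phrase boundary)."""
        kids = self.node["children"]
        if symbol in kids:
            kids[symbol]["count"] += 1
            self.node = kids[symbol]
        else:
            kids[symbol] = self._new_node()
            kids[symbol]["count"] = 1
            self.node = self.root
```

For example, after observing the stream `abab`, the predictor's root node has seen `a` more often than `b`, so it assigns `a` a higher smoothed probability for the next symbol.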