This paper proposes a method for constructing an adaptive agent that is universal with respect to a given class of experts, where each expert is designed for one particular environment. The adaptive control problem is formalized as minimizing the relative entropy of the adaptive agent from the expert that is most suitable for the unknown environment. If the agent is a passive observer, the optimal solution is the well-known Bayesian predictor. If the agent is active, however, its past actions must be treated as causal interventions on the I/O stream rather than as ordinary probabilistic conditions. It is shown that the solution to this new variational problem is a stochastic controller called the Bayesian control rule, which implements adaptive behavior as a mixture of experts. Furthermore, it is shown that, under mild assumptions, the Bayesian control rule converges to the control law of the most suitable expert.
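The mixture-of-experts behavior described above can be illustrated with a small sketch. It is not the paper's construction: it assumes a toy setting, chosen here for illustration, with three candidate environments in which exactly one arm pays off with probability 0.8 and the others with probability 0.2, and one expert per environment that always pulls its own arm. The function names (`obs_likelihood`, `expert_policy`, `run_bayesian_control_rule`) and all numeric values are hypothetical. At each step the agent samples an expert from its posterior, acts as that expert would, and updates the posterior using only the likelihood of the observation. The agent's own action contributes no likelihood term, which is one way to read the requirement that actions be treated as interventions rather than as evidence.

```python
import random

P_GOOD, P_BAD = 0.8, 0.2  # assumed payoff probabilities (illustrative only)

def obs_likelihood(m, action, reward):
    """P(reward | hypothesis m, action): under hypothesis m, arm m is the good arm."""
    p = P_GOOD if action == m else P_BAD
    return p if reward == 1 else 1.0 - p

def expert_policy(m):
    """Expert m is tailored to environment m, so it always pulls arm m."""
    return m

def run_bayesian_control_rule(true_arm, n_arms=3, steps=500, seed=0):
    rng = random.Random(seed)
    posterior = [1.0 / n_arms] * n_arms  # uniform prior over experts
    actions = []
    for _ in range(steps):
        # Sample an expert from the current posterior and act as it would.
        m = rng.choices(range(n_arms), weights=posterior)[0]
        a = expert_policy(m)
        # Simulated environment response (the true environment is unknown to the agent).
        r = 1 if rng.random() < (P_GOOD if a == true_arm else P_BAD) else 0
        # Update on the observation only; the action is an intervention,
        # so no policy-likelihood factor enters the update.
        posterior = [w * obs_likelihood(k, a, r) for k, w in enumerate(posterior)]
        total = sum(posterior)
        posterior = [w / total for w in posterior]
        actions.append(a)
    return posterior, actions

if __name__ == "__main__":
    post, acts = run_bayesian_control_rule(true_arm=2)
    print("posterior over experts:", post)
    print("share of last 100 actions on the true arm:", acts[-100:].count(2) / 100)
```

In this toy setting the posterior should concentrate on the expert matching the true environment, so the sampled actions increasingly coincide with that expert's control law, in the spirit of the convergence result stated in the abstract.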