Bayesian adaptive stochastic process termination
Mathematics of Operations Research
Asymptotic Bayes Analysis for the Finite-Horizon One-Armed-Bandit Problem
Probability in the Engineering and Informational Sciences
Online Regret Bounds for Markov Decision Processes with Deterministic Transitions
ALT '08 Proceedings of the 19th international conference on Algorithmic Learning Theory
Bounded parameter Markov decision processes with average reward criterion
COLT'07 Proceedings of the 20th annual conference on Learning theory
REGAL: a regularization based algorithm for reinforcement learning in weakly communicating MDPs
UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Online regret bounds for Markov decision processes with deterministic transitions
Theoretical Computer Science
Near-optimal Regret Bounds for Reinforcement Learning
The Journal of Machine Learning Research
Adaptive control of constrained finite Markov chains
Automatica (Journal of IFAC)
Action Time Sharing Policies for Ergodic Control of Markov Chains
SIAM Journal on Control and Optimization