Bounds on Sample Size for Policy Evaluation in Markov Environments
COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
Combining importance sampling and temporal difference control variates to simulate Markov Chains
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Reinforcement Learning with Approximation Spaces
Fundamenta Informaticae
Approximation spaces in off-policy Monte Carlo learning
Engineering Applications of Artificial Intelligence
Learning state-action basis functions for hierarchical MDPs
Proceedings of the 24th international conference on Machine learning
Reinforcement learning in the presence of rare events
Proceedings of the 25th international conference on Machine learning
Geodesic Gaussian kernels for value function approximation
Autonomous Robots
Efficient Sample Reuse in EM-Based Policy Search
ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Adaptive importance sampling with automatic model selection in value function approximation
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 3
Building portable options: skill transfer in reinforcement learning
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Least absolute policy iteration for robust value function approximation
ICRA'09 Proceedings of the 2009 IEEE international conference on Robotics and Automation
A contextual-bandit approach to personalized news article recommendation
Proceedings of the 19th international conference on World wide web
Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms
Proceedings of the fourth ACM international conference on Web search and data mining
Journal of Artificial Intelligence Research
Reinforcement learning with partially known world dynamics
UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
Policy improvement for POMDPs using normalized importance sampling
UAI'01 Proceedings of the Seventeenth conference on Uncertainty in artificial intelligence
Recursive least-squares learning with eligibility traces
EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Transfer in reinforcement learning via shared features
The Journal of Machine Learning Research
Estimating interleaved comparison outcomes from historical click data
Proceedings of the 21st ACM international conference on Information and knowledge management
Reusing historical interaction data for faster online learning to rank for IR
Proceedings of the sixth ACM international conference on Web search and data mining
Efficient sample reuse in policy gradients with parameter-based exploration
Neural Computation
Learning exploration strategies in model-based reinforcement learning
Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Fidelity, Soundness, and Efficiency of Interleaved Comparison Methods
ACM Transactions on Information Systems (TOIS)