This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and predict the consequences of changes to the system. Such predictions allow both humans and algorithms to select the changes that would have improved the system performance. This work is illustrated by experiments on the ad placement system associated with the Bing search engine.