Counterfactual reasoning and learning systems: the example of computational advertising

Authors:
Léon Bottou;Jonas Peters;Joaquin Quiñonero-Candela;Denis X. Charles;D. Max Chickering;Elon Portugaly;Dipankar Ray;Patrice Simard;Ed Snelson
Affiliations:
Microsoft, Redmond, WA;ETH Zürich, Zürich, Switzerland and Max Planck Institute, Tübingen, Germany;Facebook, Menlo Park, CA and Microsoft, Redmond, WA;Microsoft, Redmond, WA;Microsoft, Redmond, WA;Microsoft, Redmond, WA;Microsoft, Redmond, WA;Microsoft, Redmond, WA;Microsoft, Redmond, WA
Venue:
The Journal of Machine Learning Research
Year:
2013

Citing 16
Cited 0

Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning
Machine Learning

Likelihood ratio gradient estimation: an overview
WSC '87 Proceedings of the 19th conference on Winter simulation

Introduction to Reinforcement Learning

Finite-time Analysis of the Multiarmed Bandit Problem
Machine Learning

Caveats for causal reasoning with equilibrium models

Cybernetics, Second Edition: or the Control and Communication in the Animal and the Machine

Estimation of Dependences Based on Empirical Data (Springer Series in Statistics)

Covariate Shift Adaptation by Importance Weighted Cross Validation
The Journal of Machine Learning Research

Tuning Bandit Algorithms in Stochastic Environments
ALT '07 Proceedings of the 18th international conference on Algorithmic Learning Theory

Controlled experiments on the web: survey and practical guide
Data Mining and Knowledge Discovery

Causality: Models, Reasoning and Inference

A contextual-bandit approach to personalized news article recommendation
Proceedings of the 19th international conference on World wide web

Overlapping experiment infrastructure: more, better, faster experimentation
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining

Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms
Proceedings of the fourth ACM international conference on Web search and data mining

Multi-armed bandit algorithms and empirical evaluation
ECML'05 Proceedings of the 16th European conference on Machine Learning

Efficient ranking in sponsored search
WINE'11 Proceedings of the 7th international conference on Internet and Network Economics

Abstract

This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and predict the consequences of changes to the system. Such predictions allow both humans and algorithms to select the changes that would have improved the system performance. This work is illustrated by experiments on the ad placement system associated with the Bing search engine.