Simulation Studies of Multi-armed Bandits with Covariates (Invited Paper)

Authors:
Nicos G. Pavlidis;Dimitris K. Tasoulis;David J. Hand
Affiliations:
-;-;-
Venue:
UKSIM '08 Proceedings of the Tenth International Conference on Computer Modeling and Simulation
Year:
2008

Citing 0
Cited 3

Dynamic Multi-Armed Bandit with Covariates

Proceedings of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence
A contextual-bandit approach to personalized news article recommendation

Proceedings of the 19th international conference on World wide web
Optimistic Bayesian sampling in contextual-bandit problems

The Journal of Machine Learning Research

Quantified Score

Hi-index	0.00

Visualization

Abstract

We evaluate the performance of a number of action-selection methods on the multi-armed bandit problem with covariates. We resort to simulations because our primary concern is the speed with which the different methods identify the optimal policy, and not their asymptotic behaviour. The experimental results show that the performance of the ε-greedy methods is robust, while the interval estimation strategies achieve the fastest learning of the optimal policy. We propose a metric to quantify the difficulty of a multi-armed bandit problem with covariates and show that there is a trade-off between the satisfaction of the different performance measures.