Despite much progress, state-of-the-art learning algorithms for repeated games still often require thousands of moves to learn effectively, even in simple games. Our goal is to find algorithms that, across many games and against a variety of associates, learn to play effective strategies within tens of moves. Toward this end, we describe a new meta-algorithm designed to increase the learning speed and proficiency of expert algorithms. We show that this meta-algorithm enhances four expert algorithms so that they quickly learn effective strategies in two-player repeated games.
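To make the idea of a meta-algorithm over experts concrete, the following is a minimal sketch of a generic multiplicative-weights (Hedge-style) meta-learner that selects among expert strategies in a repeated game. This is an illustration under assumed interfaces, not the authors' algorithm: the `experts` (functions from round number to action) and `payoff` (function from action and round to reward) signatures are hypothetical.

```python
import math
import random


def hedge_meta(experts, payoff, rounds, eta=0.5, seed=0):
    """Generic multiplicative-weights meta-learner (illustrative sketch).

    Each round it samples one expert in proportion to its weight, plays
    that expert's recommended action, then upweights every expert by the
    exponentiated payoff its advice would have earned that round.

    experts: list of callables t -> action (hypothetical interface)
    payoff:  callable (action, t) -> float reward (hypothetical interface)
    """
    rng = random.Random(seed)
    weights = [1.0] * len(experts)
    total = 0.0
    for t in range(rounds):
        z = sum(weights)
        probs = [w / z for w in weights]
        # Sample an expert index from the weight distribution.
        r, acc, idx = rng.random(), 0.0, 0
        for i, p in enumerate(probs):
            acc += p
            if r <= acc:
                idx = i
                break
        # Play the sampled expert's action and collect its payoff.
        total += payoff(experts[idx](t), t)
        # Exponential update against each expert's counterfactual payoff.
        weights = [w * math.exp(eta * payoff(e(t), t))
                   for w, e in zip(weights, experts)]
    return total, weights
```

With two constant-action experts and a payoff that rewards only one of them, the weight on the better expert grows exponentially, so the meta-learner concentrates on it within a handful of rounds; this rapid concentration is the kind of speedup an experts-based meta-algorithm aims for.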