In the budgeted learning problem, we are allowed to experiment on a set of alternatives, subject to a fixed experimentation budget, with the goal of picking a single alternative with the largest possible expected payoff. Guha and Munagala developed constant-factor approximation algorithms for this problem by rounding a linear program that couples the various alternatives together. In this paper we present an index for this problem, which we call the ratio index, that also guarantees a constant-factor approximation. Index-based policies have the advantage that a single number (i.e., the index) can be computed for each alternative independently of all other alternatives, and the alternative with the highest index is the one experimented upon. This is analogous to the famous Gittins index for the discounted multi-armed bandit problem. The ratio index has several interesting structural properties. First, we show that it can be computed in strongly polynomial time. Second, we show that, with the appropriate discount factor, the Gittins index and our ratio index are constant-factor approximations of each other, and hence the Gittins index also gives a constant-factor approximation to the budgeted learning problem. Finally, we show that the ratio index can be used to create an index-based policy that achieves an O(1)-approximation for the finite-horizon version of the multi-armed bandit problem. Moreover, the policy does not require any knowledge of the horizon, even though we compare its performance against an optimal strategy that is aware of the horizon. This yields the following surprising result: there is an index-based policy that achieves an O(1)-approximation for the multi-armed bandit problem, oblivious to the underlying discount factor.
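The structure of an index-based policy, as described above, can be illustrated with a minimal Python sketch. It is not the ratio index or the Gittins index, neither of which is defined in this abstract. It assumes Bernoulli alternatives with a uniform Beta prior and uses the posterior mean as a stand-in index; the names `posterior_mean_index` and `run_index_policy` are hypothetical. The only point it makes is that each alternative's index is computed from that alternative's own state, and the alternative with the highest index is experimented upon until the budget runs out.

```python
import random
from typing import Callable, List, Tuple

# Assumed model: each alternative is a Bernoulli arm whose state is
# (successes, failures) observed so far, under a uniform Beta(1, 1) prior.
State = Tuple[int, int]


def posterior_mean_index(state: State) -> float:
    """Stand-in index (an assumption): the posterior mean of the arm.

    This is NOT the ratio index or the Gittins index; it only plays the
    role of "a single number computed from one arm's state alone".
    """
    successes, failures = state
    return (successes + 1) / (successes + failures + 2)


def run_index_policy(
    true_probs: List[float],
    budget: int,
    index: Callable[[State], float] = posterior_mean_index,
    seed: int = 0,
) -> Tuple[int, List[State]]:
    """Spend `budget` experiments, always on the arm with the highest index,
    then return the arm with the highest posterior mean and the final states."""
    rng = random.Random(seed)
    states: List[State] = [(0, 0)] * len(true_probs)
    for _ in range(budget):
        # Each index depends only on that arm's own state.
        indices = [index(state) for state in states]
        arm = max(range(len(states)), key=indices.__getitem__)
        successes, failures = states[arm]
        if rng.random() < true_probs[arm]:
            states[arm] = (successes + 1, failures)
        else:
            states[arm] = (successes, failures + 1)
    # Final selection: commit to a single alternative.
    best = max(range(len(states)), key=lambda i: posterior_mean_index(states[i]))
    return best, states


if __name__ == "__main__":
    best_arm, final_states = run_index_policy([0.2, 0.5, 0.8], budget=30)
    print(best_arm, final_states)
```

Because `index` is passed in as a function of a single arm's state, a different index could be substituted without changing the policy loop. The posterior-mean stand-in carries no approximation guarantee; the guarantees stated in the abstract concern the ratio index and the Gittins index.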